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ABSTRACT 


Security  Information  and  Event  Management  (SIEM)  solutions  are  a  critical  infonnation 
systems  security  control  for  monitoring,  assessing,  and  reacting  to  cyber  threats  in  near 
real-time.  A  given  SIEM  solution,  however,  is  not  a  simple  plug-and-play,  drop-in, 
security  device.  On  the  contrary,  a  successful  implementation  requires  configuration 
tailored  to  the  specifics  of  a  target  network,  as  well  as  operators  who  are  very 
knowledgeable  of  both  the  SIEM’s  functionality  and  the  characteristics  of  network/data- 
center  events. 

This  thesis  will  lay  the  framework  for  SIEM  operator  education  via  use  of  pre¬ 
captured  network/data-center  events  (i.e.,  network  traffic  and  device  log  information). 
The  desired  outcome  is  a  repeatable  framework  that  can  be  utilized  by  organizations 
interested  in  deploying  more  technically  savvy  SIEM  operators.  The  framework  will  be 
empirically  demonstrated  with  a  SIEM  learning  lab  developed  for  HP’s  ArcSight  SIEM. 
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I.  INTRODUCTION 


A.  THESIS  STATEMENT 

The  purpose  of  this  thesis  is  to  develop  a  repeatable  Security  Information  &  Event 
Management  (SIEM)  educational  framework  that  can  be  used  to  train  and  educate  SIEM 
operators  effectively.  The  framework  will  be  empirically  demonstrated  with  a  scenario- 
based  training  lab  that  analyzes  pre-captured  replay  events  (e.g.,  network  traffic  and 
device  log  information)  using  HP  ArcSight  Enterprise  Security  Management  (ESM)  [l].1 
Through  the  lab  sessions,  operators  will  explore  and  understand  the  various  functional 
components  and  operational  processes  of  ArcSight,  and  work  to  develop  a  more  optimal 
SIEM  configuration  solution  for  the  target  scenario  represented  by  its  pre-captured  replay 
events. 


To  achieve  the  primary  thesis  objective,  the  research  needs  to  answer  the 
following  questions: 

1 .  What  are  the  important  learning  elements  that  must  be  encapsulated  within 
the  SIEM  educational  framework  so  as  to  provide  a  cyber-forensic 
methodology  for  a  SIEM  operator? 

2.  What  information  within  the  pre-captured  replay  events  is  required  for 
informative  forensic  analysis? 

3.  How  does  the  correlation  engine  within  ArcSight  ESM  perform  to  identify 
the  threat  relationships  between  events? 

4.  What  is  a  more  optimal  SIEM  configuration  (e.g.,  rule  filters  and  asset 
characterizations)  that  will  result  in  the  highest  likelihood  of  threat 
detection? 

5.  How  can  the  SIEM  solution  be  generalized  into  an  instructional 
methodology  that  can  be  demonstrated  in  a  SIEM  learning  lab? 


1.  ArcSight  ESM  is  a  comprehensive  SIEM  software  solution  that  combines  traditional  security  event 
monitoring  with  network  intelligence,  context  correlation,  anomaly  detection,  historical  analysis  tools  and 
automated  remediation. 
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B.  THESIS  SCOPE  AND  ORGANIZATION 

This  thesis  will  focus  on  three  main  areas: 

1.  Review  Research  on  Currently  Deployed  SIEM  solutions 

The  first  phase  will  explain  the  role  that  SIEM  solutions  play  in  the 
overall,  defense-in-depth,  security  posture  of  a  well-defended  data  center. 
A  short  overview  of  worldwide  deployment  will  provide  some  motivation 
regarding  the  growing  adoption  of  these  types  of  security  controls  for 
providing  cyber  situational  awareness  in  data  centers.  In  particular,  we 
examine  the  significance  of  SIEM  solutions  in  environments  where  the 
volume  of  events  exceeds  the  ability  of  personnel  to  monitor  or  analyze 
these  events  manually. 

2.  Analysis  of  the  Functional  Components  within  ArcSight 

The  next  phase  will  focus  on  studying  the  entire  security  event  life  cycle 
that  starts  with  event  collection  and  continues  through  event  correlation 
and  storage.  This  will  involve  the  analysis  of  the  functionality  of  the  core 
components  and  workflow  structure  within  ArcSight  to  understand  its 
operational  processes  and  how  they  combine  to  bring  a  valued-added 
solution  to  detection  analysis  in  a  security  operations  center  (SOC). 

3.  Development  of  Training  Scenarios  Based  on  Pre-captured  Replay 
Events 

Pre-captured  replay  events  are  essentially  a  set  of  logged  records  of 
historic  cyber  activities  which  (by  design)  represent  both  nonnal  (i.e., 
benign)  and  malicious  actions.  In  this  phase,  a  cyber-forensic 
methodology  will  be  designed  and  incorporated  within  the  training  lab  to 
edify  SIEM  operators  regarding  how  configuration  changes  to  a  SIEM’s 
various  settings  result  in  a  higher  signal-to-noise  ratio  (S/N)  regarding  the 
identification  of  malicious  events  among  the  background  noise  of 
“normal”  events. 
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This  thesis  consists  of  six  chapters: 

•  The  thesis  objective  and  the  procedural  approach  used  in  achieving  the 
objective  are  described  in  Chapter  I. 

•  The  purpose  of  this  thesis  as  it  presents  the  impact  of  cyber-attacks,  as 
well  as  the  complexities  and  challenges  in  implementing  a  SIEM  solution 
are  highlighted  in  Chapter  II.  Also  included  in  this  chapter  is  an 
explanation  of  the  need  to  derive  an  effective  training  methodology  for 
SIEM  operators.  The  fundamental  principles  of  SIEM  operation  and  its 
evolution  are  also  examined. 

•  The  system  anatomy  of  ArcSight  is  described  in  Chapter  III  along  with  an 
overview  of  the  functionalities  of  its  core  components.  This  will  allow 
readers  to  understand  the  operational  processes  within  ArcSight  which 
serve  to  automate  the  detection,  and  optionally  the  reaction,  to  malicious 
actions  taken  against  a  data  center. 

•  The  strategy  for  deriving  the  SIEM  training  methodology  and  educational 
framework  is  discussed  in  Chapter  IV.  The  tools,  learning  goals  and 
method  used  in  developing  a  scenario  are  also  presented. 

•  The  learning  scenario  is  described  in  Chapter  V  by  analyzing  the  replay 
events  and  subsequently  deriving  a  more  optimal  SIEM  solution.  It 
includes  a  scenario  brief  to  SIEM  operators  and  expected  results. 

•  A  summary  of  the  research  is  included  in  Chapter  VI  along  with 
recommendations  for  future  work 
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II.  BACKGROUND 


A.  IMPACT  OF  CYBER-ATTACKS 

In  a  network-centric  modern  society,  the  increasing  operational  “footprint”  of 
computer-based  systems,  along  with  the  rewards  for  subverting/exploiting  the 
information  they  process,  store,  or  transport,  leads  to  a  rising  occurrence  of  cyber-attacks. 
Communication  across  vast  distances  is  now  quicker  and  easier  with  the  advancement  in 
mobile  computing  devices  and  more  ubiquitous  connectivity  and  bandwidth.  As  a  result, 
more  people  and  business  organizations  are  connected  virtually,  which  in  turn  increases 
the  amount  of  sensitive  and  private  information  flowing  in  the  virtual  space.  Based  on 
statistical  results  from  the  Internet  World  Stats  [1],  a  566.4%  growth  in  the  number  of 
Internet  users  around  the  world  from  2000  to  2012  has  been  recorded.  This  explosive 
growth  clearly  indicates  the  pervasiveness  of  the  Internet.  In  addition,  the  ever-growing 
reliance  on  Internet  technology  as  the  primary  medium  for  our  daily  services  (e.g., 
business  operations,  information  sources,  etc.),  has  further  extended  the  damage  that  can 
be  incurred  by  an  individual  or  organization  due  to  cyber-attacks.  In  an  article  compiled 
by  Business  Roundtable,2  named  “Growing  Business  Dependence  on  the  Internet”  [2], 
the  economic  consequences  and  adverse  impact  on  the  companies  was  highlighted,  as 
they  depend  more  on  the  Internet  to  help  them  conduct  worldwide  business  operations. 
Within  the  article,  the  following  is  quoted: 

According  to  the  World  Economic  Forum,  a  breakdown  of  the  Critical 
Information  Infrastructure  (CII)  is  one  of  the  core  risks  facing  the 
international  economy.  The  World  Economic  Forum  estimates  that  there  is 
a  10  to  20  percent  probability  of  a  CII  breakdown  in  the  next  10  years 
(Figure  1.  ),  one  of  the  highest  likelihood  estimates  of  the  23  global  risks  it 
examined  in  a  recent  report.  The  report  estimates  the  global  economic  cost 
of  the  incident  at  approximately  $250  billion,  or  more  costly  than  two- 
thirds  of  the  [all  others  included  in  the  report]  risks.  [2] 


2.  Business  Roundtable  (BRT)  is  an  association  of  chief  executive  officers  of  leading  U.S.  companies 
with  more  than  $7.3  trillion  in  annual  revenues  and  nearly  16  million  employees 
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Increasing  consensus  around  risk 
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Figure  1.  23  Core  global  risks:  Likelihood  with  severity  by  economic  loss.  From  [2]. 


Based  on  the  World  Economic  Forum’s  estimate,  it  can  be  seen  that  the  global 
economy  relies  heavily  on  CII,  which  is  essentially  the  Internet.  While  natural  disasters 
are  also  mentioned  as  a  potential  threat  to  the  Internet,  the  article  gave  greater  emphasis 
to  cyber-attacks  and  highlighted  them  as  the  main  threat  behind  various  crisis  scenarios. 
This  emphasis  is  understandable  given  the  increasing  amount  of  cyber-attacks  faced  by 
many  business  organizations  on  a  daily  basis. 

In  the  military  domain,  cyber-attacks  have  been  treated  with  a  more  serious  tone, 
being  labeled  as  “cyber-warfare”  or  “cyber-terrorism.”  Both  terms  explicitly  describe  the 
severe  impact  on  the  national  security  such  attacks  would  have  if  they  were  to  succeed. 
Many  military  intelligence  agencies  around  the  world  have  compared  conventional 
warfare  with  cyber-warfare,  revealing  that  the  latter  can  inflict  worse  damage  with  much 
less  effort.  This  is  due  in  part  to  the  increase  in  online  (i.e.,  remote  via  the  CII)  control  of 
critical  national  infrastructures  such  as  power  stations,  transportation  systems  and  oil 
refineries.  Crippling  such  key  installations  can  generate  cascading  effects  that  will  not 
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only  disrupt  daily  activities,  but  also  create  mass  damage  that  may  result  in  catastrophic 
loss  of  life  and  destruction  of  property. 

B.  COUNTERING  CYBER-ATTACKS  WITH  SIEM 

Over  the  past  decade,  the  aforementioned  threats  have  forced  cyber  defenders  to 
continuously  invent  protection  mechanisms  to  mitigate  cyber- attacks.  A  SIEM  product  is 
one  such  mitigation  solution.  Simply  stated,  a  SIEM  product  is  an  infonnation  system 
security  control  employed  for  monitoring,  assessing  and  reacting  to  cyber-attacks  in  real¬ 
time,  or  at  least  much  more  quickly  than  if  any  of  the  entailed  processes  were  done 
manually. 
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Log  Management 
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Monitortradtion&l 
security  telemetry 
Msiblity  into  servers 
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Applications 
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Security  Intelligence  Platform 


Figure  2.  Modern  evolution  of  SIEM.  From  [3]. 


The  current  state  of  SIEM  products  has  evolved  from  two  formerly  disparate 
product  categories;  SIM  (Security  Information  Management)  and  SEM  (Security  Event 
Management)  (Figure  2.  ).  While  SIM  products  were  used  primarily  to  gather  and  create 
reports  from  security  logs,  SEM  products  utilized  event  correlation  and  alerting  functions 
to  help  with  analysis  and  incident  response.  On  the  surface,  both  product  classes 
possessed  fundamental  differences  in  their  capabilities  to  manage  a  torrent  of  system 
events  and  enforce  the  desired  security  observations.  However,  in-depth  analysis  reveals 
various  commonalities  between  these  product  classes,  such  as  their  workflow  and 
reporting  processes.  Each  product  class  provided  security  features  that  can  be 
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complemented  by  the  other  to  provide  a  more  holistic  approach  in  countering  cyber¬ 
attacks.  As  such,  their  functions  were  combined  and  integrated  into  a  single  security 
management  system,  known  as  a  SIEM. 

Most  SIEM  products  work  by  deploying  multiple  collection  agents  and 
connecting  them  with  end-users’  devices,  such  as  servers  and  firewalls,  so  as  to  gather 
many  diverse  security-related  events.  The  collectors  then  normalize  the  events  to  a 
standard  format  before  forwarding  them  to  a  centralized  management  console,  which 
performs  inspection  and  flags  identified  anomalies.  By  anomalies  here,  we  mean  any 
events  that  solely  or  in  toto  represent  either  a  realized  or  pending  threat  to  the  protected 
system  (network,  data-center,  etc.).  The  process  of  identifying  anomalies  is  usually 
handled  by  either  a  rule-based  or  statistical  correlation  engine  which  establishes 
relationships  between  the  event  logs,  infers  the  significance  of  those  relationships  and 
prioritizes  them.  Once  flagged,  the  events  will  be  monitored  and  inspected  further  to 
determine  any  potential  or  realized  damage.  Optionally,  automated  actions  can  be  tied  to 
such  malicious  event  detections,  thus  providing  not  only  detection  capabilities  but 
reaction  capabilities  as  well. 

C.  COMPLEXITIES  AND  CHALLENGES  IN  IMPLEMENTING  SIEM 

The  underlying  principle  of  a  SIEM  product  is  that  relevant  system  data  about  an 
organization’s  security  is  generated  in  multiple  locations  and  collated  into  a  single  point 
of  view,  which  makes  it  easier  to  spot  the  trends  and  patterns  of  a  cyber-attack.  This  is 
supposed  to  make  analysis  and  response  easier  for  the  security  professionals.  However,  a 
SIEM  product  is  not  a  simple  plug-and-play,  drop-in,  security  device.  Based  on  a  recent 
survey  conducted  by  elQnetworks,  a  pioneer  in  simplified  security,  risk  and  compliance 
solutions;  it  was  revealed  that  organizations  are  having  difficulties  with  SIEM 
deployments  [4]. 

The  first  challenge  identified  was  the  long  deployment  time  required.  The  survey 
quoted  44%  of  the  respondents  reporting  that  it  took  a  few  weeks  to  more  than  a  month  to 
deploy  their  latest  SIEM  product.  This  reflects  that  the  products  require  fine-tuning  over 
time  to  get  real  security  value  (ROI)  and  can  take  months  to  obtain  useful  data. 
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The  second  challenge  was  attributed  to  the  lack  of  trained  personnel.  It  was  found 
that  the  complexities  involved  in  correlating  security  and  configuration  data  across  IT 
assets  and  deriving  an  adequate  set  of  security  controls  could  prove  to  be  too  technically 
complex  for  the  typical  IT  operator.  A  successful  implementation  requires  configuration 
tailored  to  the  specifics  of  the  target  network,  as  well  as  operators  who  possess  deep 
knowledge  on  both  the  SIEM’s  functionality  and  the  characteristics  of  network  events. 
This  is  a  very  broad  domain  of  knowledge  encompassing;  among  other  things,  operating 
systems,  applications,  log  formatting,  flows,  authentication,  protocol,  and  TCP/IP  stack 
understanding. 

The  last  challenge  was  associated  with  compliance  issues.  35%  of  the  respondents 
stated  that  compliance  requirements  were  the  primary  driver  behind  the  use  of  SIEM. 
Rather  than  focusing  on  the  security  deliverables  offered  by  the  SIEM  product,  some 
organizations  are  employing  them  only  to  satisfy  compliance  and  regulatory  requirements 
(e.g.,  PCI,  HIPAA,  and  GLBA).  As  such,  many  SIEM  products  were  reduced  in 
functionality  to  being  mere  electronic  organizers. 

From  the  challenges  and  complexities  stated,  it  is  clear  that  the  capabilities  of 
many  SIEM  deployments  have  been  marginalized  and  their  full  potential  in  providing  a 
more  secure  cyber  environment  has  yet  to  be  realized.  These  are  risk  factors  for 
organizations  which  have  invested  large  amounts  of  money  in  SIEM  products,  only  to 
receive  sub-optimal  protection  returns  on  this  investment.  When  configured,  deployed,  or 
operated  incorrectly,  SIEM  products  may  even  be  counter-productive  by  introducing 
added  complexity,  confusion,  and  service  disruption. 

D.  FRAMEWORK  FOR  SIEM  OPERATOR  EDUCATION 

In  truth,  SIEM  products  remain  a  state-of-the-art  technology  that  requires 
specialized  and  trained  personnel  to  set  up,  configure  and  manage  them.  When  an 
untrained  IT  operator  is  first  tasked  to  utilize  a  SIEM  product,  he/she  will  likely  be 
overwhelmed  by  the  multitude  of  configurations,  user-interfaces  and  processes  he/she 
must  understand  prior  to  successful  operation  of  the  product.  With  lack  of  a  systematic 
approach  and  detailed  understanding  of  the  workflow  processes  used  by  the  core 
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functionalities  in  a  SIEM  product,  the  IT  operator  will  be  set  back  for  many  weeks  in 
his/her  effort  to  implement  the  desired  protection.  To  ensure  that  IT  operators  possess 
adequate  knowledge  and  attain  the  required  operational  capability  within  a  short  time 
frame,  it  is  important  to  derive  an  educational  framework  that  can  be  used  in  a  repeatable 
manner.  This  will  provide  the  training  foundation  needed  to  increase  the  technical  level 
of  all  operators  in  utilizing  the  SIEM  product  effectively  to  mitigate  cyber-attacks. 
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III.  ARCSIGHT  ESM  ANATOMY 


In  this  chapter,  the  important  learning  elements  required  in  implementing  the 
ArcSight  ESM  are  explained.  Also  highlighted  are  the  various  functional  processes 
within  a  typical  SIEM  product. 

Prior  to  utilizing  ArcSight  ESM,  it  is  crucial  to  first  identify  and  understand  the 
functionalities  of  the  various  software/hardware  components  within  the  SIEM  framework 
(Figure  3).  These  are  the  tools  that  are  responsible  for  providing  the  cyber-forensic 
capabilities  that  will  be  used  by  SIEM  operators  to  collect,  analyze  and  detect  the 
malicious  events.  They  range  from  source  devices  (e.g.,  servers)  that  instantiate  the 
events  to  ArcSight  core  components.  The  workflow  structure  throughout  the  ESM  will 
then  be  explained  to  further  enhance  the  reader’s  understanding  of  the  operational 
processes  required  throughout  the  event  life  cycle. 
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Figure  3.  ArcSight  ESM’s  connectivity  diagram.  After  [5]. 
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A. 


COMPONENT  FUNCTIONALITIES 


1.  Network  Devices  and  Applications 

A  network  device  is  the  source  of  events.  It  is  a  network  node  that  has  a  physical 
processing  location  with  a  unique  network  address  (e.g.,  IP  address,  MAC  address)  and 
possesses  the  capability  to  recognize,  process  and  transmit  to  other  nodes.  The  events 
output  from  the  network  device  will  be  fed  into  the  ArcSight  SmartConnectors  to 
generate  correlation-relevant  data  on  the  network.  The  various  types  of  network  devices 
that  are  supported  by  ArcSight  ESM  are  shown  in  Figure  4. 
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Figure  4.  Network  devices  Supported  by  ArcSight  ESM.  From  [5]. 
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2.  ArcSight  SmartConnectors 

SmartConnectors  provide  the  interface  between  the  network  devices  and  the 
ArcSight  Express.  As  different  devices  have  varying  logging  formats  and  reporting 
mechanisms,  it  is  difficult  to  extract  information  for  querying  without  first  normalizing 
the  events.  Henceforth,  SmartConnectors  are  responsible  for  translating  a  multitude  of 
device  outputs  into  a  normalized  schema  that  becomes  the  starting  point  for  the 
correlation  process  done  by  the  ArcSight  Express.  After  collecting  the  event  data  from 
the  network  devices,  the  SmartConnectors  will  normalize  the  data  values  and  structures 
to  a  common  schema.  This  will  then  allow  SmartConnectors  to  filter  events  and  reduce 
the  volume  of  events  sent  to  the  ArcSight  Express,  which  in  turn  increases  its  efficiency 
and  accuracy  while  reducing  event  processing  time. 

3.  ArcSight  Express 

ArcSight  Express  is  the  heart  of  the  ESM  framework.  Commonly  known  as  the 
Manager,  it  is  a  licensed  hardware  appliance  that  drives  analysis,  workflow  and  other 
security  services.  It  writes  events  to  the  Correlation  Optimized  Retention  and  Retrieval 
Engine  (CORR-Engine)  as  they  stream  into  the  system  and  simultaneously  processes 
them  through  the  correlation  engine,  which  evaluates  each  event  with  network  model  and 
vulnerability  infonnation  to  develop  real-time  threat  summaries.  ArcSight  Express  is 
installed  with  a  Management  Console  to  provide  the  administrator  interface  for  managing 
user  accounts  and  events  storage. 

a.  CORR-Engine  Storage 

The  CORR-Engine  is  a  proprietary  data  storage  and  retrieval  framework 
that  receives  and  processes  events  at  high  rates  and  performs  high-speed  searches.  Unlike 
traditional  Relational  Database  Management  System  (RDBMS)-based  storage  that 
inhibits  high-speed  correlation  [6],  the  CORR-Engine  is  able  to  access  the  vast  amount  of 
stored  events  at  a  greater  speed,  therefore  allowing  the  correlation  engine  to  process 
logged  events  at  much  higher  rates  for  threat  detection  and  security  analysis. 
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b.  Management  Console 

The  Management  Console  is  essentially  an  embedded  web  application  that 
resides  within  the  ArcSight  Express  to  provide  a  streamlined  interface  for  managing  user 
accounts,  monitoring  events,  configuring  storage,  updating  licenses  and  managing 
component  authentication  (e.g.,  SmartConnectors’  connection). 

4.  ArcSight  Console 

Unlike  the  Management  Console  that  is  used  primarily  for  registration  of  users 
and  connecting  components,  the  ArcSight  Console  (Figure  5)  is  the  main  authoring  tool 
for  building  filters,  rules  and  security  reports.  It  is  a  separate  workstation  that  hosts  an 
application  interface  intended  for  use  by  SIEM  operators  to  build  complex  correlation 
rules  within  the  correlation  engine,  as  well  as  to  perform  routine  administrative  functions. 
In  addition,  the  console  also  provides  various  graphical  means  (e.g.,  dashboards  and  line 
graphs)  to  present  correlated  events.  Better  visual  representations  of  network  status 
improve  the  situational  awareness  of  the  SIEM  operator. 


Figure  5.  ArcSight  Console  Interface.  After  [5]. 
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B. 


WORKFLOW  STRUCTURE 


A  well-defined  workflow  structure  allows  S1EM  operators  to  understand  the 
procedural  approach  in  implementing  ArcSight  ESM.  In  addition,  it  also  provides  insight 
on  the  operational  processes  that  will  occur  at  each  step.  ArcSight  ESM  processes  events 
in  phases  to  identify  and,  optionally,  act  upon  events  of  interest.  An  overview  of  the 
major  steps  in  this  event  workflow  is  shown  in  Figure  6. 
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Figure  6.  Life  Cycle  of  ArcSight  ESM.  After  [5]. 
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1.  Phase  1  -  Key  Assets  Identification 

Identifying  key  assets  within  a  network  is  the  first  major  step  required  in  the 
SIEM  life  cycle,  as  it  will  determine  and  directly  influence  the  threat  evaluation  process 
associated  with  them.  When  an  operator  is  required  to  set  up  the  ArcSight  ESM,  he/she 
must  clearly  understand  the  network  topology  and  distinguish  the  importance  of  the 
various  subnets,  network  devices  and  end-points  (e.g.,  workstations  and  servers).  Those 
that  are  classified  as  sensitive  shall  then  be  identified  as  key  assets  and  thus  given  greater 
emphasis  during  the  processes  of  Asset  and  Network  Modeling.  Both  processes  are 
essential  functional  steps  for  ArcSight  ESM  to  become  “familiar”  with  the  network 
environment  that  it  is  protecting  and  perform  more  infonned,  meaningful,  event  threat 
evaluation  in  a  later  phase. 

a.  Asset  Modeling 

Modeling  assets  is  a  multi-step  process  that  requires  a  SIEM  operator  to 
consider  what  types  of  information  to  track  from  various  assets  in  the  network  and  how 
those  assets  interrelate.  The  distinctions  drawn  in  this  process  become  factors  for  the 
filters,  rules,  data  monitors  and  reports  that  will  be  used  to  correlate  events  generated. 

In  network  architecture,  an  asset  is  any  endpoint  that  is  considered 
significant  enough  to  be  characterized  with  details  that  will  make  correlation  and 
reporting  more  meaningful.  The  significance  can  be  determined  by  the  criticality  of  the 
information  that  is  stored  in  the  asset.  They  include  servers  and  laptops.  The  following 
information  is  required  to  establish  a  unique  identity  of  the  asset  that  will  be  recognized 
by  the  ESM: 

•  Asset  name  (a  name  used  to  refer  to  the  asset  within  ESM) 

•  Network  IP  address 

•  MAC  address 

•  Full  qualified  domain  name 

Knowing  the  identity  of  an  asset  is  not  enough  for  ESM  to  perform 
evaluation  on  the  events  generated.  To  evaluate  the  threats  or  behaviors  associated  with 
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the  asset,  it  is  crucial  to  include  more  information  pertaining  to  each  asset’s  configuration 
(Figure  7),  particularly  the  following: 

•  Vulnerabilities  -  A  vulnerability  description  is  used  to  define  the 
potential  exploitation  that  exists  in  any  hardware,  firmware  or 
software  state.  If  an  asset  meets  all  of  the  requirements  described 
in  the  vulnerability  description,  an  event  will  be  generated  by  ESM 
for  subsequent  evaluation. 
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ArcSight  Manager 
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b.  Network  Modeling 

Network  modeling  is  done  to  keep  track  of  the  asset  traffic  on  the  network 
that  is  being  monitored.  In  this  process,  the  physical  network  architecture  is  mapped  and 
segregated  into  various  functional  zones.  Every  network  device  in  the  network  is 
considered  a  separate  asset.  Examples  of  network-visible  interfaces  include  routers,  web 
servers  or  anything  with  an  IP  or  MAC  address. 

A  zone  represents  part  of  the  network  and  is  identified  by  a  contiguous 
block  of  IP  addresses.  It  usually  represents  a  functional  group  within  the  network  or  a 
subnet,  such  as  a  wireless  LAN,  VPN  or  DMZ.  Zones  are  also  how  ESM  resolves  private 
networks  whose  IP  ranges  may  overlap  with  other  existing  IP  ranges.  SmartConnectors 
tag  incoming  events  with  zones.  When  the  SmartConnector  processes  an  event,  it 
evaluates  each  of  the  IP  addresses  involved  in  the  event  and  tries  to  locate  the  zone 
associated  with  that  IP  address  among  an  ordered  list  of  networks.  Through  this  process, 
SIEM  operators  are  be  able  to  differentiate  the  events  collected  from  the  various  assets  in 
different  zones. 

2.  Phase  2  -  Data  Collection,  Normalization  and  Aggregation 

The  SmartConnector  is  the  interface  through  which  events  arrive  in  ESM  from 
devices.  It  performs  the  first  layer  of  event  tagging  by  applying  normalization,  filtering 
and  event  aggregation  to  reduce  the  volume  of  the  event  stream  to  make  event  processing 
faster  and  more  efficient  (Figure  8). 


Figure  8.  Data  Collection,  Normalization  and  Aggregation  Process.  After  [5]. 
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a.  Data  Collection 

Once  the  key  assets  have  been  identified,  the  next  process  is  to  gather 
information  from  these  assets.  The  data  collected  are  essentially  log  data,  where  each  log 
entry  is  translated  into  one  event  as  interpreted  by  the  SmartConnector. 

b.  Normalization 

Normalization  is  the  process  of  taking  events  from  disparate  devices  that 
are  presented  in  multiple  formats  and  recasting  them  into  a  common  schema  (syntax  and 
semantics).  Because  networks  are  heterogeneous  environments,  different  devices  will 
likely  have  dissimilar,  vendor-specific,  logging  formats.  This  lack  of  standardization 
hinders  event  correlation,  which  requires  that  “apples  be  compared  to  apples.”  As  such, 
the  SmartConnector  is  responsible  for  normalizing  the  events  gathered,  into  a  conforming 
event  schema.  This  is  achieved  through  a  parser  which  populates  the  native  form  values 
from  the  original  event  generator  into  the  corresponding  field  in  the  nonnalized  schema 
and  reformats  as  may  be  necessary  (e.g.,  MMDDYY  date  format  may  be  recast  into  DD- 
Month-YYYY  format).  Once  normalized,  the  raw  (native  form)  event  is  now  referred  to  as 
a  base  event.3  A  database,  specifically  the  CORR-database,  populated  with  these 
standardized  base  events  allows  for  consistent  reporting  and  forensic  analysis  to  be 
performed  on  every  event  regardless  of  its  original  data  source  or  format. 

Other  than  inheriting  the  original  values  from  the  source  device,  the 
nonnalization  process  also  converts  certain  values  for  the  following  event  field: 

•  Event  Severity  -  Some  of  the  reporting  devices  to  the 
SmartConnector  offer  initial  threat  evaluation  that  describes  their 
interpretation  of  the  danger  posed  by  a  particular  event.  For 
example,  if  a  network  IDS  detects  port  scanning  by  a  workstation 
with  an  external  IP,  the  network  IDS  will  flag  this  event  as  a  high- 
priority  exploit.  To  ArcSight  ESM,  this  is  known  as  device 
severity.  Similar  to  the  varying  logging  format,  the  scale  of  device 
severity  also  differs  between  security  devices.  For  example, 
network  Intrusion  Detection  System  (IDS)  A  may  use  a  security 
rating  of  1-10  while  network  IDS  B  uses  a  scale  of  high,  medium 
and  low.  In  this  case,  the  inherited  device  severity  readings  will  be 


3.  ArcSight  refers  to  an  event  that  has  been  processed  by  an  ArcSight  SmartConnector  as  a  base  event. 
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normalized  by  the  ArcSight  ESM  into  a  single  agent  severity  scale, 
called  agent  severity.  Both  device  severity  and  agent  severity  are 
important  data  points  that  will  be  used  to  calculate  the  event’s 
overall  security  priority. 

•  Timestamps  -  Another  factor  in  nonnalization  is  to  convert  the 
varying  timestamps  to  a  common  fonnat.  This  is  important  for  an 
enterprise  network  that  stretches  across  continents  with  different 
time  zones. 

c.  Aggregation 

The  purpose  of  aggregation  is  to  streamline  and  group  events  with  similar 
data  information,  thus  reducing  the  volume  of  base  events  sent  to  the  ArcSight  ESM.  It  is 
common  to  see  groups  of  events  arriving  at  the  SmartConnector  with  the  same  value  in  a 
specific  set  of  fields.  For  example,  a  network  device  may  submit  multiple  status  log 
entries  to  the  SmartConnector  over  a  short  period  of  time.  In  this  case,  the  events  will 
share  the  same  source  IP  address  and  event  name.  SmartConnector  aggregation  will  then 
merge  these  events  with  similar  values  into  a  single  aggregated  event  that  includes  the 
earliest  start  time  and  latest  end  time.  This  will  significantly  reduce  the  computation 
burden  on  the  ArcSight  ESM,  allowing  processing  to  occur  on  smaller  and  more 
manageable  data  sets. 

3.  Phase  3  -Event  Threat  Level  Evaluation 

One  of  the  main  security  functions  of  a  SIEM  product  is  to  accurately  identify  a 
malicious  event  that  is  happening  (or  has  happened)  within  the  network  to  the  SIEM 
operator  so  that  security  actions  can  be  taken  to  counter  or  otherwise  recover  from  the 
attack.  This  is  only  possible  after  the  events  have  been  evaluated  to  understand  their 
potential  threat  to  the  network’s  overall  security.  Evaluated  events  will  be  categorized 
into  different  threat  levels,  where  each  level  is  associated  with  a  series  of  security  actions 
that  can  be  invoked  to  reduce  or  eliminate  the  events’  adverse  impact. 

In  ArcSight  ESM,  once  the  base  events  have  been  processed  to  a  common  data 
set,  a  priority  formula  will  then  be  used  as  the  threat  level  metric  to  detennine  an  event’s 
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relative  importance  to  the  network.  This  is  an  important  function  as  it  serves  as  an 
indicator  to  signal  the  SIEM  operator  on  whether  the  evaluated  event  requires  further 
investigation  or  action.  A  high  priority  factor  generally  means  that  an  event  is  of  high  risk 
factor. 


(from  SmartConnector) 
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Figure  9.  Event  Threat  Level  Evaluation.  After  [5]. 
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The  priority  formula  consists  of  four  factors  that  combine  with  agent  severity  to 
generate  an  overall  priority  rating.  Each  of  the  criteria  described  in  Table  1  contributes  a 
numeric  value  to  the  priority  formula,  which  are  eventually  averaged  to  provide  the 
overall  importance  of  an  individual  event. 
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S/N 

PRIORITY 

FACTOR 

DESCRIPTION 

1 

Model 

Confidence 

Model  confidence  refers  to  whether  or  not  the  target  asset  has 
been  modeled  in  ESM  and  to  what  degree.  For  assets  that  have 
not  been  modeled  in  ArcSight  ESM,  the  associated  event  is 
regarded  with  lower  model  confidence,  which  in  turn  attributes 
to  lower  priority  rating. 

2 

Relevance 

As  mentioned  in  Phase  1,  attributing  assets  with  additional 
configuration  information  allows  in-depth  threat  evaluation  to 
be  performed.  In  this  factor,  relevance  refers  to  the  relevancy 
of  an  event  to  an  asset  and  it  looks  at  whether  the  event  targets 
port  and/or  vulnerabilities  of  a  particular  modeled  asset.  If 
relevance  is  high,  it  means  the  target  is  vulnerable  to  the  stated 
attack  and/or  the  stated  port  is  open. 

3 

Severity 

Unlike  device  or  agent  severity,  this  parameter  is  used  as  a 
history  function  for  the  priority  formula.  It  leverages  on  an 
ESM  resource4  called  active  list5  to  check  on  previous  events 
on  whether  an  associated  asset  has  been  listed  as  a  target  or 
attacker  before.  If  the  asset  appears  as  an  attacker  in  the  active 
list,  then  severity  will  contribute  to  a  higher  priority  rating. 

4 

Asset 

Criticality 

This  factor  measures  how  important  the  target  asset  is  in  the 
context  of  the  network  when  constructed  during  asset 
modeling.  If  an  event  reports  on  an  asset  which  is  categorized 
as  higher  criticality  due  to  its  access  to  confidential 
information,  then  this  factor  will  reflect  high  priority  rating. 

Table  1.  Factors  contributing  to  Priority  Formula.  From  [5]. 


The  calculated  priority  is  categorized  into  five  different  levels,  namely  very  low, 
low,  medium,  high  and  very  high.  While  the  lowest  level  usually  means  a  routine  event 
that  does  not  consist  of  threats,  a  high  priority  event  may  not  necessarily  mean  a  threat 
either.  For  example,  if  a  critical  asset  fails  due  to  internal  hardware  failure,  the  priority  of 
events  reporting  it  may  be  very  high,  but  it  does  not  represent  a  malicious  attack. 
Therefore,  it  is  always  important  for  a  SIEM  operator  to  conduct  further  investigation 

4.  ESM  manages  the  logic  used  to  process  evens  as  objects  called  resources.  A  resource  defines  the 
properties,  values  and  relationships  to  configure  the  functions  ESM  performs  [5]. 

5.  Active  Lists  are  used  as  like  a  bulletin  board  to  specify  fields  of  event  data  for  correlation  or 
monitoring  purposes. 
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(e.g.,  a  physical  check  on  the  reporting  device)  on  events  with  high  priority  ratings  so  as 
to  confirm  the  existence  and  characteristics  of  the  threats.  Once  confirmed,  a  pre¬ 
determined  series  of  security  actions  can  then  be  carried  out  to  mitigate  the  threats. 

4.  Phase  4  -  Events  Correlation 

Events  generated  from  multiple  endpoints  may  contain  a  mixture  of  normal  and 
malicious  event  information.  While  analyzing  a  single  event  in  isolation  is  rarely  enough 
to  confirm  its  significance,  the  difficulty  of  event  cross-referencing — correlation — 
increases  with  the  volume  of  events.  As  such,  automated  event  correlation  becomes  one 
of  the  most  important  value-added  functions  of  a  SIEM  product.  Event  correlation  is  a 
process  that  discovers  associations  among  different  but  related  events,  in  order  to  provide 
a  SIEM  operator  with  a  bigger,  more  accurate,  picture  than  a  single  event  would  provide 
in  isolation.  Correlation  links  multiple  events  together  to  detect  malicious  behaviors  that 
might  have  otherwise  been  missed  in  the  background  noise  associated  with  a  high  volume 
of  routine,  non-malicious  events  and  network  traffic. 

In  ArcSight  ESM,  once  events  have  been  normalized,  prioritized,  and  their 
endpoints  identified  within  the  asset  and  network  model,  they  are  processed  by  the 
correlation  engine,  which  is  where  filters,  rules  and  data  monitors  are  used  to  find  the 
events  of  interest  (Figure  10.  ).  ESM’s  correlation  tools  use  statistical  analysis,  Boolean 
logic  and  aggregation  to  find  events  with  particular  characteristics  as  specified  by  the 
operator. 
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Figure  10.  Events  Correlation  Process.  After  [5]. 
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a.  Filters 

Filters  are  limiting  gateways  that  help  to  separate  events  of  interest  from 
other  “noisy  events”  and  reduce  the  amount  of  events  that  are  processed  by  the  system.  It 
uses  a  set  of  conditions  that  focus  on  particular  event  attributes  to  perform  the  separation. 

Filters  are  applied  in  various  stages  of  the  event  life  cycle.  During 
SmartConnector  set  up,  the  connector  can  be  configured  with  filter  conditions  to  focus 
the  events  passed  to  the  ESM  according  to  specific  criteria  (Figure  1 1).  This  first  level  of 
filtering  prevents  those  events  that  do  not  meet  the  connector  criteria  from  being 
forwarded  to  the  ESM.  Filters  applied  at  the  subsequent  ESM  level  will  then  select  which 
events  it  will  process  based  on  the  conditions  specified  (Figure  12). 


Figure  1 1 .  SmartConnector  Filter  Configuration.  From  [8]. 
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Figure  12.  ESM  Filter  Configuration.  From  [7]. 

The  main  difference  between  both  levels  of  filtering  lies  in  their  storage  of 
rejected  events.  Only  those  rejected  events  from  the  ESM  are  stored  in  the  CORR-engine 
storage.  It  is,  therefore,  important  for  the  SIEM  operator  to  decide  where  to  apply  the 
filter.  While  applying  filters  in  the  first  level  can  significantly  improve  the  signal-to-noise 
ratio,  it  must  be  carefully  implemented  with  rigorous  checking  of  the  filter  conditions. 
This  verification  ensures  that  useful  events  are  not  inadvertently  left  out.  Rejected  events 
that  are  stored  in  the  ESM  provide  the  SIEM  operator  with  the  flexibility  to  access  those 
events  later  on  if  there  is  a  change  in  investigative  focus. 

b.  Rules 

Rules  are  the  core  tool  of  the  ESM  correlation  engine  as  they  aid  in 
revealing  the  broader  meaning  out  of  the  steady  stream  of  very  narrow  meaning 
individual  events.  A  rule  is  a  programmed  procedure  that  evaluates  incoming  events  for 
specific  conditions  and  patterns.  It  leverages  on  the  capability  of  filtering  to  first  find 
matching  base  events  and  then  provides  initiate  actions  in  response. 

As  events  of  interest  may  not  stream  into  the  Manager  in  a  consecutive 

manner,  the  rules  engine  first  looks  for  matches  that  fulfill  the  specified  conditions  and 
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holds  those  matches  in  working  memory.  With  more  events  streaming  in,  the  stored 
events  will  be  evaluated  against  the  incoming  events  for  aggregation  and  correlation. 
How  long  the  matching  events  will  be  stored  in  the  working  memory  depends  on  the  rule 
threshold.  The  threshold  is  part  of  the  rule  that  considers  how  many  matching 
occurrences  over  a  specified  time  frame  must  occur  before  the  rule’s  criteria  are 
considered  to  have  been  matched.  Those  partial  matches  and  thresholds  not  met  can  be 
discarded  from  the  working  memory. 

In  ArcSight  ESM,  rules  whose  conditions  have  been  met  will  generate  an 
ESM  event  called  a  correlation  event  (Figure  13).  Once  generated,  a  correlation  event 
will  go  through  the  event  life  cycle  again,  just  as  if  it  were  a  normalized  base  event 
reported  by  a  SmartConnector. 
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Figure  13.  Correlation  and  Base  events  on  ArcSight  Console.  After  [5], 


When  the  correlation  event  passes  through  the  correlation  engine  again,  it 
is  evaluated  by  other  rules  that  are  looking  for  correlation  events  with  matching 
attributes.  For  this  recursive  correlation  process  to  work  most  effectively,  the  rules 
leverage  the  following  two  ESM  resources: 

•  Active  Fists.  These  are  configurable  tables  that  act  like  a  common 
access  bulletin  board  where  specified  fields  of  event  data  are 
collected  and  retained  during  the  first  round  of  correlation  process. 
The  retained  event  data  found  in  these  lists  can  then  be  used  as 
conditions  evaluated  by  other  rules  in  a  subsequent  correlation 
evaluation  process.  Active  lists  can  be  used  by  the  operator  to 
associate  events  happening  in  one  area  of  the  network  with  events 
happening  in  another  area. 
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•  Session  Lists.  Unlike  active  lists,  session  lists  associate  users  with 
any  events  that  are  known  to  be  associated  with  them.  These  lists 
are  most  typically  used  for  identity  correlation,  such  as  to  track 
user  logins/logouts.  For  example,  when  a  user  logs  in  to  a  server, 
his/her  user  ID  is  added  to  the  session  list  with  a  start  time.  When 
he/she  logs  out,  an  end  time  is  appended  to  the  same  entry  in  the 
session  list.  This  will  allow  later  queries  to  the  session  list  to  check 
on  the  time  interval  over  which  a  particular  user  was  logged  into  a 
particular  server.  In  terms  of  persistency  of  lists  stored,  session 
lists  stay  in  the  system  longer  and  will  require  manual  purging  if 
removal  from  the  list  is  desired.  By  contrast,  active  lists  have  a 
Time-To-Live  (TTL)  parameter  that  enables  them  to  be 
automatically  purged  when  the  TTL  expires. 

SIEM  operators  should  be  familiar  with  the  usage  of  both  lists  so  that  the 
malicious  events  within  complex  and  varied  scenarios  can  be  tracked  by  the  recursive 
correlation  process. 

5.  Phase  5  -  Monitoring  and  Investigation 

ESM’s  normalization  and  correlation  processes  enable  SOC  personnel  to  have 
real-time  situational  awareness  as  events  occur.  After  these  backend  processes  (i.e., 
collection,  normalization,  filtering,  and  rule-evaluation)  have  completed  their  tasks,  the 
next  phase  will  be  to  present  derived  information  to  the  operator.  This  can  be  achieved  by 
using  the  various  monitoring  and  investigation  resources  within  ESM  (Figure  14). 
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Figure  14.  Monitoring  and  Investigation  Process.  After  [5]. 
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a.  Active  Channels 

Similar  to  a  channel  tuned  to  a  certain  frequency  on  a  television  set,  an 
active  channel  displays  a  stream  of  information  defined  by  parameters  set  in  the  active 
channel  editor.  As  mentioned  previously,  filters  are  used  across  the  event  life  cycle 
stages.  A  local  filter  condition  in  the  active  channel  helps  to  filter  the  stream  of 
nonnalized  base  events,  status  events,  and  correlation  events  flowing  through  the  ESM 
and  to  display  only  those  that  are  of  interest  to  the  operators.  In  addition,  an  in-line  filter 
can  be  configured  within  an  active  channel  to  further  refine  the  conditions  already  set  by 
the  local  filter.  Unlike  the  local  filter  that  allows  manipulation  of  different  event 
attributes,  an  in-line  filter  only  works  on  one  event  attribute  (i.e.,  schema  column)  at  a 
time  and  uses  the  logical  AND  operator  to  perform  the  condition  refinement.  The  in-line 
filter  offers  a  quick  and  flexible  way  for  the  SIEM  operator  to  further  examine  filtered 
events  that  works  on  the  base  conditions. 


Figure  15.  Active  channel  local  and  in-line  filters.  After  [5]. 
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The  following  two  types  of  active  channel  are  used  extensively  for 
monitoring  and  investigation: 

•  Live  Channel  -  This  channel  displays  real-time  events  and  reflects 
any  changes  at  its  next  refresh  cycle,  such  as  new  base  events 
arriving  from  SmartConnectors  or  when  a  user  annotates  an 
existing  event  with  an  investigation. 

•  Rules  Channel  -  During  situations  whereby  suspicious  behavior 
are  observed  and  new  rules  are  required  to  draw  on  different  fields 
of  event  data  for  correlation  purposes,  this  channel  allows  SIEM 
operators  to  test  newly-created  rules  on  a  fixed  time  window  with 
historical  events.  The  advantage  of  this  channel  is  that  rules  can  be 
tested  and  adjusted  accordingly  without  conflicting  with  the 
ongoing  rules  that  are  being  actively  applied  to  the  real-time 
events.  Once  the  rules  are  proven  to  work  as  expected,  they  can 
then  be  linked  to  the  Real-Time  Rules  folder,  which  holds  all  the 
rules  that  act  on  the  real-time  events. 

b.  Dashboard 

ESM  also  uses  another  resource  type  called  a  dashboard,  which  is  used  to 
present  events  in  various  user-chosen  formats.  Instead  of  using  the  typical  matrix  table  as 
shown  in  the  active  channels,  dashboards  are  capable  of  showing  events  in  a  variety  of 
graphical  and  tabular  fonnats  that  summarize  selected  collections  of  events.  Within  a 
single  dashboard,  there  can  be  multiple  data  monitors  that  provide  different  graphical 
event  presentations.  For  example,  if  a  SIEM  operator  wants  to  have  a  continuous  update 
on  the  overall  state  of  an  enterprise’s  network  security,  he/she  can  create  the  various  data 
monitors  as  shown  in  Figure  16. 
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Figure  16.  Security  Activity  Statistics  Dashboard.  From  [5]. 


Data  monitors  work  essentially  the  same  as  rules  in  that  they  evaluate  the 
event  stream  and  aggregate  events  with  common  elements.  Rules  focus  on  inferring 
meaning  from  certain  event  conditions  and  combinations  to  enhance  interpretation  and, 
optionally,  to  specify  certain  actions.  Data  monitors,  on  the  other  hand,  focus  on 
summarizing  event  data  and  presenting  them  graphically.  They  cannot  be  used  to  trigger 
actions  like  rules.  As  such,  data  monitors  provide  another  type  of  analysis,  such  as 
calculating  statistics  and  moving  data  averages. 

The  benefit  of  having  a  composite  dashboard  is  that  it  offers  an  “all-in- 
one-glance”  panel  that  summarizes  multi-node  enterprise  security  data.  This  makes  it 
easier  to  visualize  attack  patterns  among  nodes  on  the  network.  In  certain  situations,  a 
causal  relationship  between  events  can  also  be  discovered.  For  example,  an  attacker  has 
been  attempting  to  hack  into  an  asset  through  various  means.  By  selecting  those  events 
together  and  investigating  them  with  the  ESM  graphical  tools,  a  causal  relationship 
between  the  attacker  and  the  asset  can  be  more  easily  seen  (Figure  17). 
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Figure  17.  Causal  relationship  between  attacker  and  asset.  From  [5], 


6.  Phase  6  -  Reporting 

With  all  the  events  processed  and  stored  in  the  CORR-Engine  storage,  the  next 
logical  step  is  to  produce  reports  that  can  be  archived  and/or  provided  in  support  of 
accreditation  or  other  established  security  procedures.  In  the  ESM,  a  summarized 
collection  of  information  pertaining  to  any  particular  incident  is  known  as  a  report. 
Reports  are  generated  either  manually  or  via  a  pre-defined  rule  action.  Two  ESM 
resources  utilized  in  the  report  creation  process  are  the  Query  and  Template  resources. 


a.  Query  Resource 

A  query  is  a  request  action  to  a  data  source  that  reports  events  to  ESM. 
Based  on  the  query  statement,  certain  parameters  of  the  data  can  then  be  gathered.  Some 
of  the  data  sources  are  CORR-Engine  storage  and  data  stored  in  active  lists  or  session 
lists. 
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b.  Template  Resource 

Templates  are  another  fonn  of  resource  that  defines  the  structure  in  which 
data  results  from  the  report  are  presented.  The  layout  specification  can  be  tailored  by  the 
SIEM  operator  to  suit  an  enterprise’s  desired  reporting  format  and  can  enforce  the 
necessary  standardization. 


Figure  18.  Customer  Template  Design.  From  [5]. 


The  procedural  approach  in  implementing  ArcSight  ESM  has  been 
explained  in  this  chapter,  with  specific  emphasis  on  how  the  SIEM  operator  can 
effectively  utilize  the  various  tools  and  resources  to  conduct  the  forensic  activities. 
Certain  resources  of  ArcSight  ESM  are  deliberately  left  out  of  the  explanation  as  they 
either:  a)  require  additional  licensing  modules  (e.g.,  Pattern  discovery  &  Interactive 
discovery)  which  are  not  available  in  the  standard  package  or  b)  will  not  be  utilized  in  the 
training  scenarios.  Despite  this,  the  chapter  has  covered  the  core  functional  processes  that 
are  essential  to  the  operation  of  a  typical  SIEM  product,  and  this  should  facilitate  the 
basic  understanding  required  to  follow  the  material  presented  in  subsequent  chapters. 
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IV.  TRAINING  STRATEGY 


As  part  of  the  thesis  objectives,  a  repeatable  educational  framework  has  to  be 
created  in  order  to  facilitate  the  SIEM  training.  This  chapter  outlines  the  training  strategy 
that  will  be  used  to  achieve  the  aforementioned  objective.  It  will  include  the  tools, 
learning  goals  and  method  used  to  develop  the  scenario-based  training  lab  for  SIEM 
operators. 

A.  REPLAY  TOOLS 

Traffic  replay  is  one  of  the  key  features  of  a  SIEM  product.  The  ability  to  replay 
pre-captured  events  is  particularly  useful  for  the  following  purposes: 

•  Testing  -  This  allows  stress  testing  to  be  conducted  for  the  ArcSight 
configuration  and  deployment,  while  identifying  possible  loopholes  and 
vulnerabilities. 

•  Training  -  For  ease  of  demonstration  and  training  for  new  cadres  of  SIEM 
operators. 

In  ArcSight,  the  replay  events  and  connectors  are  the  replay  tools.  Similar  to  the 
SmartConnectors,  the  replay  connectors  are  used  primarily  to  feed  the  ArcSight  Express 
with  base  events.  However,  there  is  a  difference  in  the  timestamp  of  the  events  provided. 

The  SmartConnectors  lie  in  between  the  network  devices  and  ArcSight  Express. 
Real-time  raw  events  generated  from  the  network  devices  stream  through  the 
SmartConnectors  and  undergo  filtering,  aggregation,  nonnalization  and  categorization 
prior  to  arriving  at  the  ArcSight  Express.  In  contrast,  the  replay  connectors  are  standalone 
modules  that  are  not  associated  with  any  network  devices  (Figure  19).  Rather  than  “live” 
streaming,  the  replay  connectors  provide  the  replay  of  pre-captured  (historical)  events 
saved  from  past  activities. 

Despite  the  difference  in  the  timestamp,  ArcSight  Express  handles  every  base 
event  received  in  the  same  manner.  When  the  events  arrive,  ArcSight  Express  subjects 
the  events  to  the  real-time  rules  for  filtering  and  correlation  purposes. 
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Figure  19.  ArcSight  replay  connectors’  connectivity.  After  [5]. 


B.  LEARNING  GOALS 

The  learning  goals  are  stated  to  guide  the  construction  of  the  method,  content  and 
structure  for  the  learning  lab.  The  learning  goals  include  the  steps  to  create  the  lab 
environment  and  the  important  concepts  and  skills  that  will  be  taught  to  the  SIEM 
operators. 

1.  Understand  the  Behavior  of  a  Cyber  Threat 

Without  a  clear  understanding  on  the  behavior  of  the  cyber  threat,  SIEM  operators 
will  experience  great  difficulties  in  conducting  cyber  forensic  analysis.  Therefore,  it  is 
important  to  study  the  behavior  of  the  cyber  threat  as  it  will  build  the  knowledge 
foundation  in  searching  for  malicious  indicators  among  the  myriad  events  interspersed 
within  the  seemingly  normal  cyber  activities.  An  understanding  of  the  interrelation 
between  a  given  cyber  threat  and  its  associative  event-based  indicators  will  also  allow 
SIEM  operators  to  utilize  the  ArcSight  resources  for  various  core  processes  more 
effectively. 
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In  the  realm  of  computer  networks,  many  cyber  threats  exist.  Considering  the 
non-distinctive  behaviors  that  can  be  exhibited  by  the  various  threats,  it  is  practically 
impossible  to  create  a  single  use  case  or  rule  that  will  reliably  detect  all  threats.  As  such, 
the  learning  lab  will  only  focus  on  a  particular  cyber  threat.  In  this  way,  the  complexities 
involved  in  rules  configuration  will  be  significantly  reduced,  and  hence  allow  the  SIEM 
operators  to  concentrate  on  learning  the  concepts  and  skills  required  for  cyber  forensic 
analysis. 

2.  Set  up  Lab  Environment 

Before  a  SIEM  operator  can  embark  on  the  learning  process,  the  lab  environment 
must  be  set  up  correctly.  The  environment  will  require  a  collection  of  replay  events  that 
can  demonstrate  the  behavior  of  a  particular  cyber  threat.  Then  the  replay  connector  will 
be  leveraged  as  the  events  generator  to  provide  a  steady  stream  of  events  to  the  ArcSight 
Express. 

3.  Construct  Network  Topology  through  “Reverse-Engineering”  Method 

The  main  purpose  of  the  “reverse-engineering”  method  is  to  allow  the 
construction  of  the  network  topology  using  the  information  within  the  base  events. 
During  the  research  phase  of  this  thesis,  it  was  realized  that  the  replay  events  obtained 
were  not  accompanied  by  notes  regarding  the  topology  of  the  network  from  which  the 
replay  events  were  collected.  As  previously  mentioned,  the  network  topology  aids  in 
network  and  asset  modeling  and  is  a  crucial  step  required  in  the  workflow  process  to 
properly  characterize  and  prioritize  information  gleaned  from  the  raw  events  provided  by 
the  reporting  devices.  While  this  situation  of  an  “unfamiliar”  network  is  unlikely  in  a 
real-world  operational  environment,  it  is  necessary  to  deal  with  this  in  the  context  of  this 
thesis:  owing  to  the  goal  of  being  able  to  utilize  replay  events  that  have  been  collected 
from  various  networks,  yet  process  them  as  though  they  were  from  one’s  own — 
familiar — operational  network. 
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4. 


Associate  Threat  Behavior 


All  the  ArcSight  core  processes  depend  on  the  parameters  of  the  event  schema  to 
perform  their  functions.  For  example,  during  the  aggregation  process,  the  rules  may  use 
the  “Source  IP  address”  parameter  to  combine  the  network  traffic,  while  in  the  correlation 
process  the  “Agent  Hostname”  parameter  can  be  used  to  recognize  this  common 
association  between  events.  With  an  understanding  of  a  given  cyber  threat’s  behavior, 
SIEM  operators  will  be  better  able  to  correlate  collected  events  to  indications  of  that 
threat. 


5.  Configure  Rules 

Rule  writing  is  an  important  skill  for  SIEM  operators  since  a  SIEM  product  is  a 
passive  system.  As  such,  it  requires  rules  to  provide  the  logical  relationships  that  are 
deemed  likely  indicators  of  threats  or  exploited  system  behavior.  Identified  threat 
characteristics  are  the  conditional  parameters  for  the  structuring  of  rules.  With  an 
understanding  of  these  characteristics,  SIEM  operators  can  intelligently  modify,  or  even 
create,  rules. 


6.  Present  and  Report  Detected  Threats 

ArcSight  offers  various  graphical  resources  to  represent  network  activities  and 
detected  threats.  Many  are  of  these  graphical  resources  are  rather  intuitive,  permitting  the 
SIEM  operator  to  quickly  understand  the  current  situation.  The  lab  will  introduce  some  of 
the  primary  graphical  resources  that  can  be  used  to  represent  the  network  activity  of 
interest  and  detected  threats.  Subsequently,  the  ArcSight  report  resource  will  be  utilized 
to  create  a  summary  of  detected  threat  indications  that  can  be  saved  and/or  printed  out  for 
archiving  purposes. 

C.  METHOD 

Scenario-based  training  will  be  adopted  as  the  methodology  in  the  learning  lab  to 
educate  the  SIEM  operators.  The  scenario  will  closely  mimic  a  “real-world”  situation  so 
that  the  learning  experience  is  practical  and  can  be  applicable  to  other  similar  situations. 
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1. 


Scenario  Overview 


A  computer  worm  is  a  form  of  cyber  threat  that  is  encountered  by  many 
networked  systems.  A  worm  is  malware  that  does  not  rely  on  human  intervention  to 
execute  and  is  capable  of  replicating  itself  and  spreading  within  and  across  computer 
networks.  A  wonn  targets  certain  security  vulnerabilities  and  often  causes  hann  to  the 
network,  such  as  consuming  bandwidth  capacity,  thus  making  services  unavailable  to 
legitimate  users  and  applications. 

As  a  simple  demonstration,  the  model  scenario  used  in  this  thesis  is  based  on  a 
computer  worm  that  does  not  carry  malicious  payload.  Recall  that  a  SIEM  product  is 
used  to  analyze  events  reported  by  network  devices  rather  than  (itself)  analyzing  the 
semantics  of  the  “raw”  stimuli  that  elicited  the  various  events  from  the  monitored 
network  devices.  The  replay  events  for  this  model  scenario  will  consist  of  thousands  of 
entries  that  will  simulate  a  worm  outbreak  situation  where  the  worm  spreads  within  and 
across  multiple  networks.  This  behavior  of  the  typical  computer  worm  will  serve  as  the 
model  threat  for  inexperienced  SIEM  operators  to  conduct  analysis. 

The  “storyboard”  of  the  scenario  will  revolve  around  a  fictional  organization  that 
is  hosting  an  online  Web  portal  called  “SporeFusion.com.”  According  to  our  exercise 
story,  in  recent  months  this  fictional  organization  has  been  facing  an  increasing  number 
of  cyber-attacks  and  has  suffered  financial  loss  during  some  of  the  more  pernicious 
attacks.  Seeing  the  need  to  protect  their  business,  the  organization  has  decided  to 
purchase  ArcSight  ESM  as  a  detective  security  control  against  these  threats.  The  IT 
employees  within  the  organization  are  tasked  with  the  mission  of  utilizing  ArcSight  ESM 
to  detect,  report,  and  recover  from  threats  as  quickly  as  possible. 

2.  Training  Phases 

The  learning  goals  stated  will  form  the  structure  of  the  lab,  with  each  goal 
translated  to  a  phase.  In  most  of  the  phases,  the  SIEM  operators  are  required  to  interact 
with  the  ArcSight  ESM.  The  different  phases  are  intended  for  progressive  learning  in 
which  the  operators  will  obtain  the  necessary  knowledge  and  skillset  from  the  earlier 
phases,  which  are  necessary  to  complete  tasks  required  in  later  phases. 
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The  first  phase  will  require  the  SIEM  operators  to  study  the  behavior  of  computer 
worms  and,  in  particular,  understand  their  “self-replicating”  and  “spreading”  traits.  The 
second  phase  will  then  focus  on  the  installation  of  the  replay  connector  on  the  host 
machine  and  configuration  of  the  replay  events  to  generate  the  event  flow  to  ArcSight 
Express.  The  third  phase  will  demonstrate  a  way  to  infer  the  network  topology  of  the 
capture  environment  using  ArcSight  graphical  resources.  The  fourth  phase  will  expose 
SIEM  operators  to  many  of  the  most  prominent  worm  attack  indicators  that  can  be  used 
to  associate  to  the  parameters  of  the  event  schema.  The  fifth  and  sixth  phases  are  the 
highlights  of  the  learning  lab;  where  the  SIEM  operators  will  step  through  the  workflow 
process,  construct  a  set  of  logical  worm  detection  rules  (the  fifth  phase),  and  ultimately 
present  their  worm-detection  analysis  as  an  incident  report  (the  sixth  phase).  These  last 
two  phases  will  require  more  time  than  the  previous  four  phases  due  to  the  trial-and-error 
nature  of  getting  the  best  rules  to  elicit  the  desired  response  (a  high  confidence  detection 
alert)  from  the  base  events  involved. 

D.  SUMMARY 

This  chapter  has  presented  the  general  approach  for  the  design  of  a  worm 
outbreak  scenario  that  models  a  realistic  cyber-attack  issue  faced  by  many  Internet-facing 
enterprises  nowadays.  The  activities  and  tasks  associated  with  the  six  phases  will  cover 
the  learning  objectives  stated  in  Chapter  I.  The  next  chapter  will  provide  more  detail  on 
the  phases  and  how  various  ArcSight  resources  can  be  utilized  to  achieve  the  objectives. 
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V.  TRAINING  SCENARIO 


This  chapter  will  elaborate  on  the  training  methodology  applied  for  the  worm- 
outbreak  scenario.  The  content  will  be  categorized  into  six  training  phases  and  will 
include  the  expected  results. 

A.  SCENARIO  BRIEFING 

The  purpose  of  this  scenario  briefing  is  to  provide  SIEM  operators  with  the 
background  context  for  their  cyber-forensic  analysis.  The  scenario  describes  a  cyber¬ 
threat  (computer  worm)  faced  by  a  fictional  organization  that  is  hosting  an  online  Web 
portal  called  “SporeFusion.com”  and  is  summarized  as  follows: 

“SporeFusion.com”  belongs  to  a  company  that  provides  personal  photo 
service  for  subscribed  customers  to  customize  their  digital  photos  with  the 
company’s  online  photo  editing  tools.  It  also  allows  the  customers  to  print 
their  personalized  photos  on  an  array  of  products  (e.g.,  mugs,  shirts).  The 
company  maintains  its  own  IT  infrastructure.  The  infrastructure  supports 
two  networks:  an  external  network  to  provide  their  online  services  to  the 
customers  and  an  internal  network  to  maintain  their  proprietary  company 
information  (e.g.,  sales,  orders  and  inventory). 

Over  the  years,  the  organization  has  been  suffering  from  periodic  cyber¬ 
attacks,  some  of  which  have  resulted  in  financial  loss  due  to  lost  customer 
satisfaction  or  added  service  costs.  The  most  serious  attack  rendered  the 
online  portal  unavailable  for  five  days,  resulting  in  great  financial  loss 
during  that  period.  A  preliminary  investigation  by  the  company’s  own  IT 
employees  discovered  it  to  be  a  denial-of-service  attack  caused  by  a 
computer  worm.  Despite  this  discovery,  the  company  was  unable  to  trace 
the  source  of  the  attack  or  determine  the  extent  of  infected  hosts  within 
their  network. 

To  improve  this  situation  and  improve  their  overall  IT  security  posture,  the 
company  has  decided  to  purchase  ArcSight  ESM  as  a  detective  security 
control  against  these  threats.  The  IT  employees  within  the  organization  are 
tasked  with  the  mission  of  utilizing  ArcSight  ESM  to  detect  and  recover 
from  threats  as  quickly  as  possible. 
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B. 


PHASE  1  -  UNDERSTAND  THE  BEHAVIOR  OF  A  COMPUTER  WORM 


The  first  phase  will  require  SIEM  operators  to  conduct  a  study  on  the  behavior  of 
a  computer  worm.  Through  this  study,  the  SIEM  operators  will  leam  and  understand  the 
worm’s  associative  traits,  which  will  then  become  the  target  of  analysis.  The  following 
highlights  some  of  the  fundamental  information  pertaining  to  the  computer  worm  that 
must  be  understood  by  the  SIEM  operators  during  their  study: 

•  The  worm  usually  originates  from  a  single  host. 

•  The  wonn  has  the  ability  to  self-replicate.  This  means  that  once  it  infects  a 
host,  the  worm  will  understand  what  network  it  is  on  and  will  attempt  to 
find  nearby  host(s)  (likely  to  be  a  host  with  the  same  vulnerability  and  lies 
within  the  same  network)  and  attempt  connections  to  the  host(s)  through 
various  sockets/ports.  Thereafter,  these  newly  spread  worms  will  repeat 
the  same  actions  on  their  newly  infected  hosts. 

C.  PHASE  2  -  SET  UP  REPLAY  ENVIRONMENT 

The  next  phase  is  to  install  the  replay  events  and  connector  for  continuous  testing 
and  analysis  on  a  set  of  pre-cap tured  events.  (Refer  to  the  Appendix  for  the  installation 
guide.)  After  successful  installation,  the  replay  graphical  user  interface  (GUI)  can  be 
activated  through  the  command  prompt  in  administrator  mode. 


ram  Command  Prompt 


Microsoft  Windows  tilers  ion  6.1.76011 

Copyright  <c>  2009  Microsoft  Corporation.  All  rights  reserved. 
C:\Users\Tenshin >cd. . 

C:  Misers >cd. . 

C:\>cd  program  files  <x86>\arcsightsmartconnectors\current\bin 

C:\Program  Files  <x86>\ArcSightSmartConnectors\current\bin>arcsight  agents.. 


#  #  1  ^  i  M 


Figure  20.  Replay  connector  activation. 


The  replay  GUI  is  the  main  control  mechanism  that  initiates  the  events 
transmission  to  ArcSight  ESM.  Once  the  replay  files  have  been  placed  in  the  correct 
folder  as  stated  in  the  Appendix,  the  replay  GUI  will  automatically  load  all  the  stored 
events  and  allow  SIEM  operators  to  select  the  desired  set  of  events  to  be  streamed  into 
ArcSight  ESM  (Figure  21).  The  GUI  also  provides  a  time  slider  and  events  transmission 
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rate  option  (e.g.,  events/sec  or  events/min)  to  manipulate  the  rate  of  events  being 
transmitted. 


Figure  2 1 .  Replay  connector  GUI. 

The  base  events  stored  within  the  file  can  be  accessed  by  “right-clicking”  on  the 
selected  file  and  choosing  “Open  Selected.”  A  new  screen  will  appear  with  all  the  base 
event  entries  as  shown  in  Figure  22. 
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Figure  22.  Access  base  event  entries. 
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For  this  test  scenario,  the  “WormOutbreak.events”  file  is  selected.  As  seen  in 
Figure  23,  each  row  represents  one  base  event,  while  each  column  represents  a  parameter 
of  the  ArcSight  event  schema.  The  cell  holds  the  value  of  the  parameter  associated  with 
that  particular  base  event. 
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Figure  23.  Event  Entries  in  “WormOutbreak.events”  file. 

D.  PHASE  3  -  CONSTRUCT  NETWORK  TOPOLOGY  THROUGH 
“REVERSE-ENGINEERING”  METHOD 

ArcSight  graphical  resources  are  particularly  useful  in  analyzing  the  traffic  flow 
within  a  set  of  replay  events.  The  resources  work  by  leveraging  the  parameters  within  the 
event  schema  (e.g.,  source  and  destination  IP  addresses,  device  vendor  name  and  zone)  to 
map  out  the  networks,  associated  network  devices  and  the  communication  flow  between 
these  devices.  With  a  basic  understanding  on  IP  networking  (e.g.  RFC19186  private 
networks),  it  is  therefore  possible  for  SIEM  operators  to  utilize  the  graphical  resources  to 
“reconstruct”  the  network  topology  from  a  set  of  replay  events. 


In  normal  circumstances,  the  reconstruction  method  must  be  applied  to  all  the 
events  in  the  replay  file  so  as  to  obtain  the  full  network  topology,  which  may  consist  of 


6.  RFC  1918  are  standards  that  define  the  private  IP  address  space. 
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many  sub-networks.  For  the  ease  of  illustration,  the  following  example  will  focus  on  a 
small  subset  of  the  events  within  the  replay  files.  This  is  done  because  the  particular 
replay  file  used  in  our  example  was  collected  from  a  rather  large  volume  of  IP  space;  and 
viewing  all  of  this  space  using  the  SIEM’s  graphical  utilities  would  result  in  a  very 
“crowded”  topological  depiction.  By  limiting  the  demonstration  to  a  smaller  subset  of  the 
entire  IP  space  (by  selecting  a  subset  of  the  collected  events),  the  resulting  graph  will  be 
easier  to  see  via  the  below  screen  captures. 

Prior  to  viewing  the  base  events  in  ArcSight  Console,  the  SIEM  operators  must 
first  create  an  active  channel.  Periodically,  ArcSight  ESM  will  generate  internal  system 
events  that  report  its  current  status  (e.g.  user  log-in).  To  prevent  such  events  from 
clustering  with  the  events  of  interest,  the  SIEM  operators  will  use  the  active  channel 
editor  to  filter  out  the  internal  system  events.  In  addition,  a  customized  grid  field  object 
will  be  created  to  limit  the  number  of  viewable  events’  parameters.  In  our  training 
scenario,  both  the  “Agent  Type”  and  “Severity”  parameters  will  be  set  to  “syslog_pipe” 
and  “>=  3,”  respectively  (with  an  OR  condition),  in  the  active  channel  editor  to  display 
the  base  and  correlation  events  pertaining  to  the  worm  outbreak  scenario. 
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Figure  24.  Create  Active  Channel  in  Navigator  Panel. 
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Figure  25.  Customize  Active  Channel. 
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Figure  26.  Customize  Grid  Fields. 


Figure  27.  Create  filter  in  Active  Channel  Editor. 
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Next,  start  the  replay  connector  to  allow  the  base  events  to  flow  into  ArcSight 
Express. 
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Figure  28.  Start  replay  connector. 
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22)  [3:37:36]  4  of  the  121  fields  in  the  selected  field  set  for  condition  editor  are  custom  fields  or  not  supported  fields  that  cannot  be  displayed.  ~T~ 1 


Figure  29.  “WormOutbreak”  Base  events  in  ArcSight  Console. 
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Once  all  the  base  events  have  been  received,  ArcSight  Console  will  be  able  to 
process  the  encapsulated  information  and  display  the  inter-connecting  network  devices 
through  its  graphical  resources. 
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Figure  30.  Select  a  subset  of  events  and  choose  “Event  Graph.” 


Figure  3 1 .  Choose  color  code  for  different  nodes. 
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[41)  [2:42:57]  4  of  the  121  fields  in  the  selected  field  set  for  condition  editor  are  custom  fields  or  not  supported  fieldsthat  cannot 


Figure  32.  Overall  graph  view  using  organic  layout. 
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Figure  34.  Zoomed  in  view  of  Network  ‘B’  and  ‘C.’ 
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Figure  35.  IP  address  of  network  device. 
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Figures  32,  33,  and  34  show  the  inter-connecting  networks.  The  white,  blue  and 
red  boxes  represent  the  hosts,  while  the  arrow  lines  indicate  the  transmission  flow 
direction  between  the  hosts.  From  the  figures,  you  can  see  that  a  series  of  transmissions 
have  been  initiated  by  different  hosts  in  the  following  order: 

•  Public  host  (IP  address  =  206.116.23.54)  to  public  hosts  in  Network  ‘A’ 
(65.85.126.x).  Transmission  is  pennitted  as  indicated  in  Figure  35. 

•  Public  host  in  Network  ‘A’  (IP  address  =  65.85.126.60)  to  private  hosts  in 
Network ‘B’ (10.0.111.0). 

•  Private  host  in  Network  ‘B’  (IP  address  =  10.0.1 1 1.39)  to  private  hosts  in 
Network ‘C’  (10.0.20.x). 

Network  ‘B’  and  Network  ‘C’  are  private  networks  because  they  contain  IP 
addresses  that  belong  to  the  private  address  space  as  defined  in  RFC  19 18.  With  this 
analysis,  the  above  transmission  trend  can  be  interpreted  as  a  typical  traffic  flow  that 
usually  happens  when  an  external  host  attempts  to  communicate  with  an  internal  host  that 
is  secured  within  a  DMZ.  For  this  case,  the  situation  can  be  translated  to  the  training 
scenario  as  such: 

•  A  “foreign”  (not  part  of  SporeFusion)  host  on  the  Internet  (IP  address  = 
206.1 16.23.54)  has  sent  an  email  containing  a  computer  worm  to  an  email 
relay  server  (IP  address  =  65.85.126.60)  hosted  by  “SporeFusion.com.” 
This  email  relay  server  lies  in  SporeFusion’s  DMZ  zone,  Network  ‘A’, 
and  is  responsible  for  relaying  emails  from  external  networks  to  internal 
networks  and  vice  versa. 

•  The  computer  worm  infected  the  email  relay  server,  causing  the  mass 
sending  of  emails  to  other  hosts  within  SporeFusion’s  internal  (in  relation 
to  the  DMZ)  network,  Network  ‘B’  (10.0.1 1 1.x). 

•  One  of  the  hosts  (IP  address  =  10.0.111.39)  is  infected  by  the  computer 
wonn  and  in  turn  forwards  the  mail  to  the  hosts  within  another  internal 
network,  Network  ‘C.’ 

•  From  Figure  35,  the  Cisco  router  (IP  address  =  65.85.126.1)  is  the  firewall 
that  implements  the  DMZ  and  separates  the  internal  and  external 
networks. 

Using  the  above  information,  the  network  topology  of  the  subset  events  can  then 
be  constructed  as  shown  in  Figure  36.  . 
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Figure  36.  Simplified  Network  Topology. 

SIEM  operators  can  now  begin  to  follow  through  the  workflow  structure  as 
described  in  Chapter  III  by  starting  with  key  asset  identification.  Supposedly,  the  finance 
manager  of  the  “SporeFusion.com”  organization  is  given  a  personal  computer  (IP  address 
=  10.0.111.39)  to  carry  out  his/her  work  routine.  The  computer  contains  a  lot  of 
confidential  information  pertaining  to  the  financial  health  of  the  organization  and  is 
therefore  identified  as  a  key  asset.  ArcSight  has  preemptively  modeled  the  private 
network  space  (IP  address  =  lO.x.x.x)  into  the  Zone  and  Network  models  using  the 
globalized  Internet  standards  (e.g.,  RFC  19 18),  as  shown  in  the  following  figures: 
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With  the  network  models  already  available  (as  per  the  original  networks  from 
which  the  replay  traffic  was  collected),  the  SIEM  operators  will  only  have  to  create  the 
asset  model  for  the  personal  computer  that  belongs  to  the  Finance  Manager  and  link  it  to 
the  zone.  Arc  Sight  has  provided  a  “System  Asset  Category”  that  helps  to  define  the 
importance  of  the  asset  in  the  perspective  of  the  organization.  Since  the  personal 
computer  is  a  key  asset  to  the  organization,  it  will  be  tagged  with  “very  high”  criticality. 
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Figure  39.  Create  asset  model  step  1. 
51 


"|_t.  ij=)  1^3 


ArcSight  Console  5.1.0.1281.3  [arcsightexsvnadmin.ast]  Permanent  license.  Customer  Arcsight 


File  Edit 

□  -  i 

View 

j  « | 

Window 

j  i<m  | 

Tools  Syste 

n 

" 

Help 

\9%  -] 

Under  “Categories” 
tab  for  the  asset 

ion 

1 

m 

■■■■ 

■  * 

Navigator 

S'  v  x  inspect/ toic  t 

ri1  ? 

Resources  Packages  |  Use  Cases 
Ip  Assets 


Event  Inspector 


ip  Asset: SporeFusion  Finance  Mgr 


Assets  Zones  Networks  Categories  Vulnerabilities  Locations 


Click  on  “Add...’ 
button 


SporeFusion  Financ 


I  '•  "  I  -ate  Interfaces  Vulnerabilities  Notes 


c{}a  Add.. 


No  matching  Local  Asset  Categories  found. 


f  Edit  Remove  Refresh 


Local  Asset  Categories 


Asset  Categories  Selector 


"53 


! 

Select  “Very  high” 
under  System  Asset 
Criticality 

0 

B  E5  Asset  Categories 
0--I&  Shared 

ir®  All  Asset  Categories 
B  rm  ArcSight  Solutions 
GB T”  lf^  ArcSight  System  Administration 

+  rm  DOD 

JumpStart 

+  r®  Site  Asset  Categories 


B  T”  Criticality 
W  Cfl  High 
$ra  Low 
■  i  I  [i’lnlll  im 


EUbi  i 

^  1  sr  r®  www 

+  Q  Unassigned 


Figure  40.  Create  asset  model  step  2. 


By  creating  an  asset  model  for  the  personal  computer,  ArcSight  Express  will  now 
be  able  to  analyze  the  asset  with  greater  detail  and  provide  a  better  threat  level  evaluation, 
which  contributes  to  event  priority  determination.  Figure  41  illustrates  how  this  added 
asset  characterization  contributes  to  the  situational  awareness  of  the  operator. 
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Figure  41 .  Priority  level  difference  for  modeled  and  non-modeled  asset. 


52 


E. 


PHASE  4  -  ASSOCIATE  WORM  BEHAVIOR 


With  the  fundamental  information  gathered  in  Phase  1,  Phase  4  will  focus  on 
associating  the  following: 

•  The  worm’s  behavior  with  the  parameters  of  the  event  schema. 

•  The  communication  pattern  of  a  wonn  outbreak. 

Computer  worms  usually  originate  from  a  single  host  and  attempt  to  spread  to 
other  hosts  in  the  same  network.  Based  on  this,  the  “Source  IP  address,”  “Destination  IP 
address,”  “Port”  and  “Zone  Name”  serve  as  good  candidate  cueing  parameters  to  track 
the  spreading  effect.  Other  than  those  IP  addresses,  the  ArcSight  ESM  also  introduced 
two  more  virtual  parameters,  “Attack  IP  Address”  and  “Target  IP  address,”  that  can  be 
used  to  identify  a  worm’s  infected  source  and  target  destination.  The  reason  for 
introducing  the  additional  parameters  can  be  explained  as  such:  A  Network  Intrusion 
Detection  System  (NIDS)  has  reported  attacks  from  A  (e.g.,  hacker)  to  B  (e.g.,  server).  In 
this  normal  case,  the  “Source”  and  “Attacker”  will  hold  the  same  value,  while  the 
“Destination”  will  correspond  to  “Target.”  However,  if  information  were  to  flow  from  the 
server  to  the  hacker  (e.g.,  password  file  is  downloaded  by  the  hacker  from  the  server),  the 
base  event  will  report  this  case  as  “Source”  =  “Attacker”  =  B,  and  “Destination”  = 
“Target”  =  A.  This  means  that  the  server  has  become  the  “Attacker”  while  the  hacker  has 
become  the  “Target,”  which  is  obviously  not  true.  To  rectify  such  situations,  the 
“Originator”  parameter  within  the  event  schema  can  be  configured  accordingly  to  ensure 
that  “Attacker”  and  “Target”  are  pointing  to  the  correct  hacker  and  server,  respectively. 
As  explained,  both  the  “Attacker”  and  “Target”  parameters  provide  better  clarity  in  threat 
identification.  Henceforth,  the  rules-writing  covered  in  the  next  phase  will  cue  on  these 
two  parameters  rather  than  using  the  “Source”  and  “Destination”  parameters. 

In  addition,  the  wonn’s  transmission  pattern  also  suggests  a  “one-to-many” 
communication  pattern,  which  can  be  translated  to  the  possibility  of  a  network  sweep 
occurrence,  whereby  an  infected  host  will  attempt  to  communicate  with  multiple,  perhaps 
consecutive  IP  addresses,  hosts  within  the  same  network.  This  pattern  provides  another 
threat  behavior  that  ArcSight  can  be  configured  (via  a  correlation  rule)  to  cue  off  of. 


With  the  above  assumptions,  the  association  result  is  summarized  in  Table  2. 
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S/N 

WORM 

ASSOCIATED 

ASSOCIATED 

INFORMATION 

PARAMETERS 

PATTERNS 

1 

It  usually 
originates  from  a 
single  host. 

-  “Attacker”  &  “Target”  IP 
Addresses,  Port  and  Zone 

-  “Attacker”  &  “Target”  Port 

-  “Attacker”  &  “Target”  Zone 

- 

2 

It  has  the  ability 
to  self-replicate. 

- 

-  Internal  network  sweep 

-  Outbound  network  sweep 

Table  2.  Worm  association  table. 


F.  PHASE  5  -  CONFIGURE  RULES 

Prior  to  writing  new  rules  for  the  worm  outbreak  scenario,  it  is  important  to  know 
that  ArcSight  ESM  provides  a  Standard  Content  package  that  comes  pre-installed  within 
the  ArcSight  Console  (Figure  42).  There  are  a  series  of  “out-of-the-box”  resources  (e.g., 
filters,  rules  and  dashboards)  that  address  common  security  and  management  tasks  and 
that  are  designed  to  help  new  users  quickly  deploy  ArcSight  [10]. 
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Figure  42.  Standard  Content  for  rules  and  filters. 
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In  this  phase,  some  of  the  resources  in  the  Standard  Content  will  be  utilized  and 
explained.  With  an  understanding  on  how  these  resources  perform,  SIEM  operators  will 
be  able  to  leverage  the  given  resources  to  either  modify  (improve)  existing  rules  or  create 
new  rules  that  may  be  much  better  tailored  to  the  operator’s  own  network/system 
environment.  The  following  sections  illustrate  the  steps  to  create  a  rule. 

1.  Step  1  -  Create  “Worm  Outbreak  Test  List”  Active  List 

As  mentioned  earlier,  an  active  list  can  be  used  as  a  kind  of  ad  hoc  “bulletin 
board”  to  specify  fields  of  event  data  for  correlation  or  monitoring  purposes.  For  this 
scenario,  an  active  list  called  “Worm  Outbreak  Test  List”  (Figure  43)  will  be  created  to 
hold  the  information  of  the  internal  infected  hosts  once  the  worm  outbreak  detection  rules 
have  been  triggered.  This  active  list  will  be  useful  for  the  next  phase. 
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Figure  43.  Create  “Worm  Outbreak  Test  List”  Active  List. 
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2.  Step  2  -  Create  “Worm  Outbreak  Test  Rule” 

This  is  the  main  rule  that  will  be  responsible  for  correlating  the  base  events  and 
detecting  the  presence  of  any  worm-infected  hosts.  SIEM  operators  would  create  the  rule 
using  the  associated  parameters  identified  previously.  For  ease  of  configuration,  the 
existing  filters  “Internal  to  Internal  Events”  and  “Outbound  Events”  (Figure  44)  from  the 
Standard  Content  will  be  utilized  to  identify  the  associated  patterns  in  this  rule.  Both 
filters  are  meant  to  track  events  that  flow  from  a  private  network  (e.g.,  lO.O.x.x)  to 
another  private  network  or  external  (and  non-private)  network. 
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Figure  44.  Filters  from  Standard  content. 


The  detection  rule  will  be  created  to  look  for  a  single  host  trying  to  communicate 
with  at  least  10  other  hosts  on  the  same  target  port  within  a  given  network  within  one 
minute.  To  achieve  this,  the  logical  configuration  to  correlate  the  base  events  is  as  such: 
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LOGICAL 

COMPARISON  &  ACTIONS 

CONFIGURATION 

Conditions 

Every  base  event  shall  be  checked  and  ensured  that: 

•  the  associated  parameters  contain  a  certain 

value(s)  for  comparison,  and. . . 

(Figure  45.  ) 

•  the  correlated  event  does  not  already  exist  in  the 
active  list,  and... 

•  the  traffic  mimics  a  network  sweep  pattern  (e.g., 
Internal  to  Internal  or  outbound  network  sweep). 

Aggregation 

Correlate  10  base  events  if  its: 

•  “Attacker”  IP  address  and  zone  and  “Target”  port 

(Figure  46.  ) 

remains  the  same,  or. . . 

•  “Target”  IP  address  and  zone  is  unique. 

Actions 

If  the  conditions  are  met,  and  the  aggregation  minimum  is 
reached,  the  applicable  base  events  will  be  considered 
“correlated”  to  worm  attack  behavior,  and  these  actions  will  be 
taken: 

•  Add  the  “Attacker”  IP  address  and  zone  and 

(Figure  47.  ) 

“Target”  port  to  the  active  list  (if  not  already  in 
the  list). 

•  Set  the  “Priority”  to  10. 

•  Set  the  “Category”  parameters  to  be  worm- 
related. 

Table  3.  Logical  Configuration  for  “Worm  Outbreak  Detection  Test  Rule.” 
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Figure  45.  Worm  outbreak  detection  conditions. 
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Figure  46.  Worm  outbreak  detection  aggregation. 
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Figure  47.  Worm  outbreak  detection  actions. 


3.  Step  3  -  Test  rule  in  “Rules  Channel” 

After  the  rule  has  been  created,  the  SIEM  operator  can  test  the  rule  using  the 
“Rules  Channel.”  This  channel  allows  the  rules  to  be  imposed  on  a  fixed  time  window 
with  historical  events  and  prevents  any  conflicts  made  to  the  real-time  rules  deployed  for 
the  “Live  Channel.”  A  SIEM  operator  will  usually  undergo  a  few  iterations  of  trial-and- 
error  in  steps  2  and  3  prior  to  obtaining  the  desired  result.  The  following  shows  the 
expected  (desired)  result: 
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Figure  49.  Correlation  events  from  rule. 
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4.  Step  4  -  Apply  rule  to  “Live  Channel” 

Once  the  rule  has  been  verified  to  be  working,  the  next  step  will  be  to  apply  the 
rule  to  the  “Live  Channel”  as  follows  (Figures  50  and  51). 
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With  the  rule  deployed  to  the  “Real-time  Rules”  folder,  all  the  replay  events  from 
the  replay  connector  will  now  be  examined  to  detect  the  presence  of  internal  worm- 
infected  hosts.  The  expected  results  are  as  such: 
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Figure  52.  Infected  hosts  detected  in  “Worm  Outbreak  Channel.” 
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Figure  53.  Infected  hosts  information  in  “Worm  Outbreak  Test  List.” 


G.  PHASE  6  -  LIST  INFECTED  HOSTS 

This  phase  will  focus  on  using  the  graphical  resources,  in  particular  the 
dashboard,  to  list  the  infected  hosts  once  they  have  been  detected  by  the  rule.  The 
dashboard  will  consist  of  several  data  monitors  that  are  supplemented  by  a  filter.  The 
filter  is  responsible  for  eliciting  and  displaying  the  events  of  interest.  An  incident  report 
will  be  created  to  provide  summaries  of  the  data  pertaining  to  the  infected  hosts.  The 
steps  are  as  follows: 

1 .  Create  “Worm  Outbreak  Filter  List.” 

2.  Create  “Worm  Infected  Test  Monitor.” 

3.  Create  “Worm  Propagation  by  Host  Test  Monitor.” 

4.  Create  “Worm  Propagation  by  Zone  Test  Monitor.” 

5.  Create  “Worm  Outbreak  Test  Dashboard.” 

6.  Create  “Worm  Outbreak  Test  Report.” 

1.  Step  1  -  Create  “Worm  Outbreak  Filter  List” 

The  reason  for  creating  the  “Worm  Outbreak  Test  List”  in  the  previous  phase  was 
to  provide  the  information  of  the  worm-infected  hosts  as  a  rule  input  to  yet  another  rule. 
This  other  rule  will  be  used  by  the  data  monitors  to  create  a  list  of  all  infected  hosts. 
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2.  Step  2  -  Create  “Worm  Infected  Test  Monitor” 

This  data  monitor  displays  the  number  of  hosts  that  have  been  infected  in  the 
course  of  a  worm  outbreak. 
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Figure  55.  “Worm  Infected  Test  Monitor”  configuration. 
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Figure  56.  “Worm  Infected  Test  Monitor”  expected  display. 


3.  Step  3  -  Create  “Worm  Propagation  by  Host  Test  Monitor” 

This  data  monitor  shows  the  spread  of  worm  activity  throughout  the  network. 
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Figure  57.  “Worm  Propagation  by  Host  Test  Monitor”  configuration. 
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Figure  58.  “Worm  Propagation  by  Host  Test  Monitor”  expected  display. 


4.  Step  4  -  Create  “Worm  Propagation  by  Zone  Test  Monitor” 

This  data  monitor  shows  the  spread  of  worms  across  network  zones. 
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Figure  59.  “Worm  Propagation  by  Zone  Test  Monitor”  configuration. 
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Figure  60.  “Worm  Propagation  by  Zone  Test  Monitor”  expected  display. 

5.  Step  5  -  Create  “Worm  Outbreak  Test  Dashboard” 

Prior  to  displaying  the  content,  the  data  monitors  must  be  attached  to  the 
dashboard. 
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Figure  62.  “Worm  Outbreak  Test  Dashboard”  expected  display. 
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6.  Step  6  -  Create  “Worm  Outbreak  Test  Report” 

Similar  to  the  data  monitors,  a  pdf  report  will  be  generated  based  on  the 
information  captured  in  the  “Worm  Outbreak  Test  List.”  The  report  will  consist  of  a  table 
that  provides  the  information  of  the  infected  host  and  a  chart  that  shows  the  number  of 
times  the  infected  hosts  have  been  detected  (i.e.,  matched  against  the  Worm  Outbreak 
Test  Rule). 
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Figure  63.  “Worm  Outbreak  Test  Report”  attributes  configuration. 
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Figure  64.  “Worm  Outbreak  Test  Report”  template  configuration. 
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Figure  65.  “Worm  Outbreak  Test  Report”  table  data  configuration. 
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Figure  66.  “Worm  Outbreak  Test  Report”  chart  data  configuration. 
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Figure  67.  “Worm  Outbreak  Test  Report”  expected  display  page  1 . 
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Figure  68.  “Worm  Outbreak  Test  Report”  expected  display  page  2. 
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H.  SUMMARY 


This  chapter  has  elaborated  on  the  training  methodology  and  provided  details  on 
how  to  utilize  the  comprehensive  resources  within  ArcSight  to  detect  the  occurrence  of  a 
worm  outbreak  threat  situation.  The  structural  approach  has  been  categorized  into 
procedural  phases  and  is  summarized  as  such: 

•  Scenario  briefing  -  Provide  the  background  context  of  the  training 
scenario. 

•  Phase  1  -  Conduct  a  study  on  the  computer  worm  and  understand  its 
behavior. 

•  Phase  2  -  Set  up  the  replay  connector  with  the  wonn  outbreak  replay  file. 

•  Phase  3  -  Construct  a  network  topology  using  the  parameters  from  the 
events  provided  by  the  worm  outbreak  replay  file. 

•  Phase  4  -  Associate  wonn  behavior  to  the  parameters  of  the  events. 

•  Phase  5  -  Configure  rules  and  filters  to  detect  and  elicit  information  about 
the  infected  hosts. 

•  Phase  6  -  Present  the  information  of  the  infected  hosts  using  monitors  and 
dashboard  and  create  a  summarized  report  of  the  incident. 

In  addition  to  the  explanation,  the  expected  results  (e.g.,  screenshots)  are  also 
included  to  enhance  the  understanding.  The  next,  and  final,  chapter  will  conclude  this 
thesis  by  presenting  the  findings  and  proposing  additional  areas  for  future  research. 
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VI.  CONCLUSION 


A.  FINDINGS 

1.  Importance  of  Source  Events 

SIEM  products  are  passive  systems  that  do  not  possess  the  capability  of  detecting 
or  analyzing  malicious  payload.  The  products  rely  on  information  culled  from  the  various 
log  reports  collected  from  the  various  reporting  devices.  The  reported  information  may  be 
highly  specific  if,  for  example,  the  reporting  device  is  a  security  system  (e.g.,  a  host- 
based  IPS)  that  “intelligently”  analyzes  traffic  and/or  end-device  events.  The  reported 
information  may,  on  the  other  hand,  provide  only  a  “hint”  about  a  threat  (e.g.,  a  failed 
connection  attempt)  and  require  clever  correlative  rules  writing  on  the  SIEM  itself  in 
order  to  deduce  an  actual  threat.  Threats  (e.g.,  virus)  that  only  harm  the  infected  host 
(e.g.,  deletion  of  system  files)  and  do  not  display  any  distinctive  network  traits  may  be 
able  to  sneak  past  the  watchful  SIEM  products  if  no  device  reporting  to  the  SIEM 
provides  any  tell-tale  log  infonnation  related  to  that  threat.  As  such,  it  is  important  for 
SIEM  operators  to  understand  this  causal  relationship  and  emphasize  the  proper 
configurations  at  the  source  devices. 

2.  “Reverse-Engineering”  Inadequacy 

As  demonstrated  in  Chapter  IV,  a  simplified  network  topology  can  be  constructed 
using  the  steps  mentioned.  However,  it  is  crucial  to  know  that  the  “reverse-engineering” 
method  is  non-intuitive  and  will  require  certain  assumptions  to  justify  the  sequence  of 
transmission  flow.  In  addition,  the  reconstruction  method  is  not  “perfect”  due  to  the 
following: 
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S/N 

INADEQUACY 

EXAMPLE 

1 

Events  captured  in  the  replay 
file  may  not  reveal  the  entire 
configuration  of  the  reporting 
device. 

A  router  with  4  used  interfaces  will  typically  have 

4  IP  addresses.  However,  if  the  replay  file  only 
contains  transmission  records  involving  2  of  these 
interfaces,  the  IP  addresses  of  the  other  2  would 
never  be  reported. 

2 

The  actual  number  of  network 
devices  and  hosts  identified 
may  differ  from  the  actual 
topology. 

When  a  base  event  shows  that  a  transmission 
occurs  between  two  hosts,  it  does  not  necessarily 
mean  that  the  destination  host  is  physically 
available.  The  base  event  can  only  ascertain  the 
existence  of  a  source  host  that  attempts  to 
communicate  with  the  destination  host. 

3 

The  replay  file  may  not 
contain  events  that  report 
existing  devices  in  the  actual 
topology. 

A  host  connected  to  the  reporting  device  may  not 
have  any  transmission  with  other  devices.  As  a 
result,  the  replay  file  will  not  contain  any  events 
that  will  provide  information  on  the  presence  of 
the  host. 

Table  4.  Reconstruction  method  inadequacies 


Despite  this,  the  “reverse-engineering”  method  is  still  important  in  the  absence  of 
the  actual  network  topology  for  the  context  of  this  thesis. 

B.  ADDITIONAL  AREAS  FOR  FUTURE  RESEARCH 
1.  Create  more  Scenarios  with  Different  Threats 

This  thesis  has  illustrated  only  the  structural  approach  in  tackling  one  cyber¬ 
threat.  There  are  many  other  cyber  threats  that  could  be  “captured”  as  replay  files,  some 
of  which  could  be  quite  complex,  and  thus  requiring  more  effort  in  tailoring  the  specific 
rules  and  filters  to  attain  some  desired  level  of  student  learning.  By  working  on  other 
training  scenarios,  SIEM  operators  would  be  able  to  gain  further  insights  on  the 
commonalities  amongst  various  threats  and  better  understand  the  utilization  of  the 
resources  provided  by  ArcSight  ESM  in  deriving  a  more  optimal  solution. 
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2.  Explore  Other  ArcSight  Resources 

In  addition  to  the  resources  mentioned,  ArcSight  also  provides  other  resources 
that  are  useful  in  profiling  the  entities  within  the  network.  For  example,  the  “Actor” 
resource  can  be  used  to  model  the  user  and  administer  his/her  associated  security 
classification;  which  is  useful  for  detecting  insider  threats.  The  “Use  Case”  resource 
provides  a  way  to  view,  configure,  and  transport  specially  developed  sets  of  related 
resources  that  address  specific  security  issues  and  business  requirements.  When  used 
effectively,  this  resource  can  be  translated  into  different  scenarios  for  multi-level  training 
purposes. 

C.  CONCLUSION 

Computer  threats  have  constantly  evolved  over  the  years  and  security 
practitioners  are  finding  it  harder  to  keep  in  pace  with  this  worrying  trend.  As  explained 
in  Chapter  II,  issues  exacerbating  the  situation  can  be  attributed  to  the  lengthy  time 
required  in  training  qualified  and  competent  SIEM  operators,  as  well  as  the  complexity  of 
the  tools  that  can  be  used  to  detect  the  presence  of  such  threats. 

In  the  face  of  such  problems,  this  thesis  has  focused  on  developing  a  cyber¬ 
forensics  training  methodology  that  aims  to  educate  SIEM  operators  on  the  effective 
approach  to  monitoring  and  reacting  to  cyber  threats.  It  incorporates  a  repeatable  SIEM 
educational  framework  and  highlights  the  structural  approach  in  utilizing  ArcSight  ESM 
to  collect,  analyze  and  detect  malicious  events.  The  framework  has  been  empirically 
demonstrated  with  a  worm  outbreak  lab  scenario,  where  the  captured  results  illustrate  the 
feasibility  of  the  proposed  approach.  Through  the  methodology  presented,  it  is  hoped  that 
the  rate  of  training  competent  SIEM  operators  might  be  increased  so  as  to  keep  up  with 
the  growing  risks  of  operating  in  cyber  space. 
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APPENDIX.  REPLAY  CONNECTOR  INSTALLATION  GUIDE 


This  installation  guide  illustrates  some  of  the  parameters  to  be  configured  and 
provides  screenshots  to  aid  the  installation  process.  Two  files  are  required  in  this  setup: 
“WormOutbreak. events”  and  “ArcSight_Connector_Installer.exe.” 

First,  install  “ArcSight_Connector_Installer.exe”  on  a  host.  For  ease  of 
management,  you  can  install  the  connector  on  the  same  host  that  contains  the  ArcSight 
Console.  The  following  figures  show  screenshots  of  the  installation  process. 


Figure  69.  ArcSight  Connector  Installer  screenshot. 


Figure  70.  “Choose  Install  Folder”  screenshot. 
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Figure  71 .  “Choose  Install  Set”  screenshot. 


Figure  72.  “Choose  Shortcut  Folder”  screenshot. 


Figure  73.  “Pre-Installation  Summary”  screenshot. 
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After  the  installer  has  copied  all  the  working  files  to  the  selected  folders,  a 
separate  “Connector  Setup”  screen  will  appear.  Select  “Add  a  Connector”  and  continue  to 
the  next  screen. 


<  Connector  Setup 


Configure 


What  would  you  like  to  do? 


ig)  Add  a  Connector 
©  Enable  FIPS  mode 


<  Previous  [  Next  >  |  |  Cancel 


Figure  74.  “Add  a  Connector”  screenshot. 


In  this  screen,  you  must  select  the  “Test  Alert”  option  from  the  list  of  ArcSight 
SmartConnectors“Connector  details”  (Figure  75.  ). 


<  Connector  Setup 


Configure 


Select  the  connector  to  configure 


[  <  Previous  |  |  Next  >  |  [  Cancel 


Figure  75.  “Connector  options”  screenshot. 
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Figure  76  shows  the  parameters  that  control  the  events  flow  to  the  ArcSight 
Express.  Their  definition  can  be  seen  by  mousing  over  their  text  label  (Figure  76).  Some 
of  the  more  important  parameters  to  take  note  are  as  follows: 

•  maxrate  &  eventrateunit  -  These  two  parameters  control  the  maximum 
rate  at  which  events  will  be  sent  to  ArcSight  ESM.  For  initial  learning 
purpose,  the  rate  shall  be  slow  so  that  it  allows  ample  time  for  analysis. 

•  setdetecttimeasnow  &  setagenttimeasnow  -  These  two  parameters  shall  be 
set  to  “true”  so  that  the  timestamp  of  the  base  events  can  be  synchronized 
with  ArcSight  Console. 

•  randomizeratetime  -  This  parameter  introduces  randomness  to  the  base 
events  received.  It  shall  be  set  to  “0”  to  prevent  the  confusion  during  the 
analysis  of  the  base  events. 

•  Uienabled  -  This  parameter  must  be  set  to  “true”  for  the  replay  GUI  to 
appear. 


<  Connector  Setup 


Configure 


ArcSight^; 


Modify  simple  parameters 

maxsleeptimie 
timefactor 


Maximum  number  of  seconds  to  wait  between  alerts 
frue  ▼ 


false  ▼ 
false  ▼ 


continuous 
maxrate 

setdetecttimeasnow 
setagenttimeasnow 
preservedetecttime 
preserveagenttime 
randomizeratetime 
contcachesize 
eventrateunit 
overridearcsightcategory  false 
startpaused  true  ▼ 

ovemdezoneinfo  false  ▼ 

autoload 
loadail 
uienabled 


Second  ▼ 


false  ▼ 
false  » 


Cancel 


Figure  76.  “Connector  details”  screenshot. 


Since  the  base  events  must  be  sent  to  ArcSight  Manager  for  analysis,  the 
“ArcSight  Manager  (encrypted)”  option  shall  be  selected  as  the  destination. 
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Figure  77.  “Select  destination”  screenshot. 

Depending  on  how  the  ArcSight  Console  has  been  configured,  the  “Manager 
Hostname,”  “User”  and  “Password”  fields  will  be  filled  with  the  respective  values  that 
are  used  to  log  into  the  ArcSight  Console.  The  rest  of  the  parameters  shall  be  set  to  false. 


Figure  78.  “Destination  parameters”  screenshot. 

The  respective  values  will  be  used  to  uniquely  identify  the  connector. 
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Figure  79.  “Connector  details”  screenshot. 

If  the  values  in  Figure  78  have  been  entered  correctly,  the  connector  setup  will 
attempt  to  import  the  certificate  from  the  ArcSight  Manager. 


<  Connector  Setup 


0 


Following  certificate  will  be  imported  into  connector  trust  store: 
Host/port: 

Details:  CN=  ,  OU=ESM,  CK  ,  L= 

ST=  ,  C= 


o  Import  the  certificate  to  connector  from  destination 
0  Do  not  import  the  certificate  to  connector  from  destination 


ArcSight*^ 


in 


Figure  80.  “Import  Certificate”  screenshot. 
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Figure  8 1 .  “Add  connector  Summary”  screenshot. 


Figure  82.  “Install  as  a  service”  screenshot. 
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Figure  83.  “Service  parameters”  screenshot. 


Figure  84.  “Exit”  screenshot. 


Figure  85.  “Install  complete”  screenshot. 
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Once  the  replay  connector  has  been  installed  successfully,  create  a 
‘ReplayEvents”  folder  within  the  ArcSight  SmartConnector  folder,  and  copy  the 
“WormOutbreak.  events”  file  into  it. 


0 


OQri  «  Program  Files  (x86)  ►  ArcSightSmartConnectors  ►  current  ►  ReplayEvents 


▼  1^11  Search  ReplayEvents 


Organize  ▼  Include  in  library  ▼  Share  with  ▼  New  folder 

8H  -  □  # 

ift  Downloads 
•  Dropbox 
^1  Recent  Places 

Name 

Date  modified  Type 

Size 

1  ,  Worm  Outbreak,  events 

8/6/2006  6:49  AM  EVENTS  File 

2,435  KB  | 

Libraries 

2]  Documents 
Music 

S  Pictures 
$  PPTV&551 

Q  Videos 

E 

Homegroup 

Computer 

&  Windows7_OS  (C 
r-a  Local  Disk  (D:) 

1  item 

Figure  86.  “ReplayEvents”  folder  screenshot. 


The  parameters  of  the  replay  connector  (Figure  76)  can  be  modified  subsequently 
by  typing  “runagentsetup”  in  the  command  prompt  as  shown  in  the  following: 


|  Administrator  Command  Prompt  -  runagentsetup 


..lcrosoft  Windows  [Uersion  6.1.7601  J 

Copyright  <c>  2009  Microsoft  Corporation,  fill  rights  reserved. 


(=)  23 


C:\Windows \system32>cd. . 

3:\Windous>cd. . 

C:\>cd  program  files  <x86)\arcsightsmartconnectorsScurrent\bin 

i:\Program  Files  <x86>\ArcSightSmartConnectors\current\bin>runagentsetup 

Assuming  ARCSIGHT_HOME:  C:\Program  Files  <x86)\ArcSightSmartConnectors\current 
Assuming  JAUA_H0ME:  C:\Program  Files  <x86>\ArcSightSmartConnectors\current\jre 

ArcSight  Agent  Setup  starting...  . - 

Connector  Setup  Wizard  starting  in  mode  [SWING] 

[Thu  Aug  15  17:56:58  PDT  2013]  [INFO  ]  Checking  for  a  running  instance  of  connec 
tor _ 


<  Connector  Setup 


s 


What  would  you  like  to  do? 


®  Modify  Connector 
g'  Uninstall  as  a  service 
0  Enable  FIPS  mode 


ArcSi^ht^C 


<  Previous  Next>  Cancel 


Figure  87.  Modify  replay  connector’s  configuration. 
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